ABSTRACT

This investigation studies optimal control of linear sampled-data
systems where the control is subject to saturation. The system is
described by the state-space method. The control is considered to be
optimal when it minimizes a performance index which is defined as a sum
over the sampling instants of a quadratic function of the states and
controls.

The solution begins with the Principle of Optimality. A form is
assumed for the optimal return function, and recurrence relations are
derived for the one-input case which are different depending on whether
the optimal control is or is not saturated. The optimal control is
shown to be a piecewise linear function of the states. A computing
method that uses the recurrence relations to solve the infinite stage
regulator problem is presented and discussed in detail. This method
requires less computer time and memory than would straight dynamic
programming.

Both one- and two-input control are considered. The two-input case
requires a third set of recurrence relations for use when one input is
saturated and the other is not. More inputs can be handled using the
same methods, but the complexity increases rapidly with the number of
inputs. A detailed discussion of a simple method for finding the
minimum of a positive definite quadratic function in two variables
subject to the constraint that the minimum be on or within a rectangle
is presented.

Four examples showing the optimal control of second-order systems
determined by the computing method given in this report are presented
and discussed.
A. OUTLINE OF THE PROBLEM

As an example of the problem investigated in this report, consider
a space vehicle whose attitude is to be controlled by reaction wheels
or gyros. Various disturbances, perhaps impulsive as from collisions
with micrometeorites, produce an error in attitude which must be
corrected. In applying the control, the integral over time of the
attitude squared-error plus the squared-control is to be minimized.

Because the control torque is subject to saturation, a nonlinearity
is inherent in the system. Thus it is expected that the optimal control
will be a nonlinear function of the states of the system: the attitude
error and velocity. This optimal control function is to be stored in a
small special-purpose digital computer called a digital controller.

The introduction of a computer makes the system sampled-data. The
computer determines from the states of the system at each sampling
instant the correct optimal control to apply over the next sampling
interval.

Since the system is now sampled-data, rather than minimize an
integral it is logical and convenient to minimize the sum over time of
the attitude squared-error plus the squared-control at the sampling
instants.

More generally, this investigation studies optimal control of linear
sampled-data systems where the control is subject to saturation. The
system will be described by the state-space method developed by Kalman
and Bertram [Ref. 1].

At each sampling instant the system is assigned a performance number,
which is a quadratic function of the state error and control. The sum
of the performance numbers over a given number of samples is called the
performance index. Only the transient regulator problem (that of finding
the control sequence which, from a given initial condition with no
external disturbances and no commands, minimizes the performance index)
will be considered.


The sampling rate is often fast enough that a sampled-data system
can be approximated, for the purpose of analysis, by a continuous
system. In this investigation, however, the sampling rate is considered
to be slow enough that the sampling process introduces significant
effects into the performance of the system.

B. SUMMARY OF RELATED WORK

There is a considerable body of literature on the subject of optimal
control of sampled-data systems, but almost no mention of the specific
problem presented here.

Using the Principle of Optimality, the problem without constraints
on the control was solved by Kalman and Koepcke [Ref. 2]. They show
that for the infinite stage regulator problem the optimal control takes
the form of stationary, linear feedback gains. Work on this problem
was also done by Henry [Ref. 3].

Several researchers have worked on the problem investigated here,
though using minimum time response as the criterion of optimality.
Among these are Kurzweil [Ref. 4], Desoer and Wing [Ref. 5], and Kalman
[Ref. 6].

Merriam [Refs. 7, 8], using his parametric expansion method, has
studied the problem in the continuous case.

Bellman's computational method of dynamic programming [Ref. 9] solves,
among others, problems of the type studied here when the dimension of
the state vector is small. The special problem of this report, minus
constraints on the control, is mentioned by Bellman and Dreyfus [Ref. 10].

Quadratic performance criteria have been used by many researchers in
both the continuous and sampled-data cases.

The state-space method of describing linear sampled-data systems is
discussed in detail by Kalman and Bertram [Ref. 1], Kalman [Ref. 11],
Gunckel [Ref. 12], and Rauch [Ref. 13].

C. OUTLINE OF NEW RESULTS

For the first time in the literature the problem of computing the
optimal feedback coefficients of a sampled-data system with bounded
control using quadratic performance criteria is discussed in detail.

In Chapter II a mathematical description of the system and the
performance criterion is given, and the problem formulation is presented.
Two examples using this formulation are discussed.

Recurrence relations necessary to the computation method are derived
in Chapter III for the single-control case. The optimal control is
shown to be a piecewise-linear function of the states.

In Chapter IV a general computing method is presented for the
single-input case, and problems connected with the computations are
discussed in detail. The method is also compared with dynamic
programming. It is shown that, because it takes advantage of the
information contained in the recurrence relations, the method developed
here requires much less computer time and memory than would dynamic
programming.

Chapter V extends the work to the case where the system has two
controlling inputs. Extension to systems with more inputs presents no
formal difficulties but is not discussed due to its complexity.

Results of computer solutions of four examples are presented and
discussed in Chapter VI.
II. STATEMENT OF THE PROBLEM

This investigation considers those sampled-data systems that can be
adequately described by linear difference equations. These equations
will be written in the state-space form used by Kalman and others. For
conciseness, vector-matrix notation will be used throughout.

A. THE SYSTEM

The plant, or system to be controlled, is described by the linear
vector-difference equation

$$z(n+1) = \Phi z(n) + \Delta u(n) \tag{2.1}$$

and the vector equation

$$y(n) = M z(n), \tag{2.2}$$

where $z(n)$ is an $(m \times 1)$ state vector,
      $y(n)$ is a $(p \times 1)$ output vector,
      $u(n)$ is a $(q \times 1)$ input (control) vector,
      $\Phi$ is an $(m \times m)$ transition matrix,
      $\Delta$ is an $(m \times q)$ distribution matrix,
      $M$ is a $(p \times m)$ output matrix.

All vectors are considered to be column vectors. Row vectors will
be written, for example, as $z^T(n)$, where $T$ denotes the transpose
operation. $y(n)$ is the measurable output vector. If all the states
are directly measurable, then $M$ is the identity matrix.

Since physically the control variables cannot be unbounded, each
element of the control vector $u(n)$ is bounded from below by the
corresponding element of a vector $\alpha^-$ and from above by the
corresponding element of a vector $\alpha^+$. That is,

$$\alpha^- \le u(n) \le \alpha^+. \tag{2.3}$$

The control vector $u(n)$ will have dimension one (i.e., it will be
a scalar) in Chapters III and IV. Chapter V will extend the results to
higher-dimensional $u(n)$.
B. THE PERFORMANCE CRITERION

If a system is to be optimized, some criterion must be chosen that
determines how well the system is operating. In this investigation a
single number that characterizes overall performance is assigned to the
system at each sampling instant. This number, called the performance
number $V_n$, is defined to be a quadratic function of the difference
between the actual output of the system, $y(n)$, and the constant desired
output $y_d$, plus a quadratic cost on the control required to achieve
the output. Mathematically this is

$$V_n = [y(n) - y_d]^T Q' [y(n) - y_d] + u^T(n-1)\,\Gamma\,u(n-1), \tag{2.4}$$

where $Q'$ and $\Gamma$ are positive semidefinite symmetric matrices.
With no further loss in generality let $y_d = 0$.

$V_n$ can be stated in terms of $z(n)$ by using Eq. (2.2):

$$V_n = z^T(n)\,Q\,z(n) + u^T(n-1)\,\Gamma\,u(n-1), \tag{2.5}$$

where $Q$ is a symmetric positive semidefinite matrix defined by

$$Q = M^T Q' M. \tag{2.6}$$

Given an initial condition $z(0)$, the control is considered optimal
if it minimizes in $N$ stages the sum of the costs $V_n$. This sum, called
the performance index, is denoted by $J_N[z(0)]$:

$$J_N[z(0)] = \sum_{n=1}^{N} \left[ z^T(n)\,Q\,z(n) + u^T(n-1)\,\Gamma\,u(n-1) \right]. \tag{2.7}$$
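As an illustration of Eq. (2.7), the following Python sketch evaluates the performance index of a single-input system by stepping Eq. (2.1) forward and summing the stage costs. The function names, the list-based matrix layout, and the numerical values in the usage lines are assumptions made for illustration; they do not come from the report.

```python
def mat_vec(matrix, vector):
    """Return the product of a matrix (list of rows) and a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def performance_index(phi, delta, q, gamma, z0, controls):
    """Sum of z^T(n) Q z(n) + gamma * u(n-1)^2 over the stages, Eq. (2.7).

    phi is m x m, delta is a length-m list (single input), q is m x m,
    gamma is a scalar weight, and controls holds u(0), ..., u(N-1).
    """
    z = list(z0)
    total = 0.0
    for u in controls:
        # State transition, Eq. (2.1): z(n+1) = Phi z(n) + Delta u(n)
        z = [pz + d * u for pz, d in zip(mat_vec(phi, z), delta)]
        # Stage cost, Eq. (2.5)
        total += sum(zi * qz for zi, qz in zip(z, mat_vec(q, z))) + gamma * u * u
    return total


# Hypothetical values: a second-order plant with unit sampling interval
phi = [[1.0, 1.0], [0.0, 1.0]]
delta = [0.5, 1.0]
q = [[1.0, 0.0], [0.0, 0.0]]
print(performance_index(phi, delta, q, 1.0, [1.0, 0.0], [-1.0, 1.0]))
```

The control sequence is passed in explicitly, so the same function can score any candidate sequence, saturated or not.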

Although the performance index is limited to quadratic functions,
many useful problems can be formulated using criteria of this type.
Integral-squared-error has been used with continuous systems for some
time, and sum-squared-error is a logical extension to use with
sampled-data systems. The above formulation allows not only squared-error
terms, but also cross-products between the states, to be charged. Often
the energy used for control must be conserved, and the charge on
squared-control allows for this. Squared terms also provide a simple
analytical approximation to absolute value.

The principal concern of this investigation is the infinite stage
regulator problem; thus the performance index is

$$J_\infty[z(0)] = \sum_{n=1}^{\infty} \left[ z^T(n)\,Q\,z(n) + u^T(n-1)\,\Gamma\,u(n-1) \right]. \tag{2.8}$$

C. THE PROBLEM STATEMENT

The problem can now be precisely stated: Given the system defined
by the linear vector-difference Eq. (2.1), and given the bounds on the
control defined by the vector-inequality (2.3), find for all initial
conditions $z(0)$ the control sequences $u[z(0)],\ u[z(1)],\ u[z(2)],\ \ldots$
that minimize the performance index $J_\infty[z(0)]$.

Finding the optimal control for all states distinguishes the control
problem from the optimal trajectory problem. In the latter usually
only one or a few initial states are of interest.

D. EXAMPLES

Two examples of the above formulation will be given. The solution
to these examples will be discussed in Chapter VI.

For the first example consider a space vehicle whose attitude is
to be controlled to an inertially fixed reference direction by reaction
wheels. In its simplest formulation the small angular motion of the
vehicle about a principal axis can be studied by considering the vehicle
as an inertia with moment of inertia $I$ about that axis [Ref. 14].

The state variables are the attitude error $\theta$ and its derivative
$\dot\theta = d\theta/dt$. The sampling interval is $\tau$ seconds long,
and the control is held constant over the sampling interval by a
zero-order hold [Ref. 15]. The system is shown in Fig. 1.

The equations of motion are

$$\dot z_1(t) = z_2(t), \qquad \dot z_2(t) = u(t)/I, \tag{2.9}$$

where $z_1 = \theta$ and $z_2 = \dot\theta$.
FIG. 1. BLOCK DIAGRAM OF SYSTEM IN FIRST EXAMPLE.

The hold takes the value of $u^*$ at the sampling instant and holds it
constant at that value until the next sampling instant. That is,

$$u(t) = u^*(n\tau), \quad \text{for } n\tau \le t < (n+1)\tau. \tag{2.10}$$

Solving Eqs. (2.9) for $z_1[(n+1)\tau]$ and $z_2[(n+1)\tau]$ in terms
of $z_1(n\tau)$, $z_2(n\tau)$, and $u(n\tau)$ gives the $\Phi$ and $\Delta$
matrices. A simple way to determine these matrices is to let, one at a
time, an independent variable $z_1$, $z_2$, or $u$ at time $n\tau$ be unity
while the others are zero and solve for the dependent variables $z_1$ and
$z_2$ at time $(n+1)\tau$. Thus, for example, let the Laplace transform of
$u(t - n\tau)$ be $U(s) = 1/s$ and solve for $Z_1(s)$, which is

$$Z_1(s) = \frac{1}{Is^2}\,U(s) = \frac{1}{Is^3}. \tag{2.11}$$

The inverse transform is

$$z_1(t - n\tau) = \frac{(t - n\tau)^2}{2I}. \tag{2.12}$$

Letting $t = (n+1)\tau$ gives

$$\delta_{11}(\tau) = \frac{\tau^2}{2I}. \tag{2.13}$$

In the same manner the other elements of the $\Phi$ and $\Delta$ matrices
can be found. These are

$$\Phi = \begin{bmatrix} 1 & \tau \\ 0 & 1 \end{bmatrix}, \qquad
\Delta = \begin{bmatrix} \tau^2/2I \\ \tau/I \end{bmatrix}. \tag{2.14}$$
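The element-by-element procedure above can be checked numerically. The sketch below, in Python, writes down the transition and distribution matrices that follow from an inertia driven through a zero-order hold, namely $\Phi = [[1, \tau], [0, 1]]$ and $\Delta = [\tau^2/2I,\ \tau/I]^T$, and compares the distribution column with a crude Euler integration of the equations of motion over one sampling interval. The function names and the numerical values are hypothetical.

```python
def double_integrator_matrices(tau, inertia):
    """Phi and Delta for z1' = z2, z2' = u / I with u held constant over tau."""
    phi = [[1.0, tau], [0.0, 1.0]]
    delta = [tau * tau / (2.0 * inertia), tau / inertia]
    return phi, delta


def integrate_one_interval(z1, z2, u, tau, inertia, steps=20000):
    """Euler integration of the equations of motion over one sampling interval."""
    dt = tau / steps
    for _ in range(steps):
        z1, z2 = z1 + z2 * dt, z2 + (u / inertia) * dt
    return z1, z2


# Unit control from rest reproduces the Delta column (up to integration error)
phi, delta = double_integrator_matrices(tau=0.5, inertia=2.0)
print(delta, integrate_one_interval(0.0, 0.0, 1.0, 0.5, 2.0))
```

Setting one initial state to unity with zero control recovers the corresponding column of $\Phi$ in the same way.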
A performance criterion needs to be chosen. $M$ is the identity
matrix, which means both attitude error and its rate of change can be
measured directly. Assume the performance number is the sum of the
attitude squared-error and the squared-control. Furthermore, assume
the cost of an error in attitude is to be weighted equally with the cost
of control. Thus

$$Q = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad \Gamma = 1. \tag{2.15}$$

The problem then is: For each initial condition $z(0)$ find the
control sequence $u(0),\ u(1),\ \ldots$ that minimizes the performance index

$$J_\infty[z(0)] = \sum_{n=1}^{\infty} \left\{ z_1^2(n\tau) + u^2[(n-1)\tau] \right\}. \tag{2.16}$$

From here on, to conform with the original problem statement, the
$\tau$ will be dropped from the arguments, with no implication that $\tau = 1$.

As a second example consider an artificial satellite orbiting the
earth. Using small angle approximations and neglecting other terms of
small magnitude, the pitch equations of motion are decoupled from roll
and yaw. The vehicle can be described in pitch as an inertia with
moment of inertia $I$ [Refs. 14, 16]. An important external force acting
on the satellite is exerted by the gravity gradient. For small values
of $\theta$, this force is proportional to the attitude error $\theta$ with
constant of proportionality $k$, as shown in Fig. 2.

FIG. 2. BLOCK DIAGRAM OF SYSTEM IN SECOND EXAMPLE.
From Fig. 2 the equations of motion can be written down by inspection.

$$\dot z_1(t) = z_2(t), \qquad \dot z_2(t) = -(k/I)\,z_1(t) + u(t)/I. \tag{2.17}$$

For this example the method of finding $\phi_{11}$ will be shown in detail.
Conceptually it is easier in this case to consider the transfer function
from $Z_2(s)$ to $Z_1(s)$ and let $z_2$ be the delta function. This has
the effect of making $z_1(n) = 1$ as desired. Thus

$$Z_1(s) = \frac{s}{s^2 + (k/I)}\,Z_2(s); \qquad Z_2(s) = 1. \tag{2.18}$$

Therefore

$$\phi_{11}(\tau) = \cos(p\tau), \tag{2.19}$$

where

$$p^2 = k/I. \tag{2.20}$$


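Similarly the entire $\Phi$ and $\Delta$ matrices can be found, and the
vector-difference equation is written as

$$z(n+1) = \begin{bmatrix} \cos(p\tau) & (1/p)\sin(p\tau) \\ -p\,\sin(p\tau) & \cos(p\tau) \end{bmatrix} z(n)
+ \frac{1}{I} \begin{bmatrix} (1/p^2)\,[1 - \cos(p\tau)] \\ (1/p)\sin(p\tau) \end{bmatrix} u(n). \tag{2.21}$$

The performance index for this example is chosen as

$$J_\infty[z(0)] = \sum_{n=1}^{\infty} z_1^2(n). \tag{2.22}$$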

The  solutions  to  both  of  the  preceding  examples  are  discussed  in 
detail  in  Chapter  VI. 
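The matrices of the second example can be sketched in Python as below. The code assumes the standard zero-order-hold solution of $\dot z_1 = z_2$, $\dot z_2 = -(k/I) z_1 + u/I$ with $p^2 = k/I$ and $k > 0$; the function name and the numerical values are hypothetical. Letting $k$ approach zero should recover the pure-inertia matrices of the first example, which gives a quick consistency check.

```python
import math


def gravity_gradient_matrices(k, inertia, tau):
    """Phi and Delta for z1' = z2, z2' = -(k/I) z1 + u/I with a zero-order hold.

    Requires k > 0 so that p = sqrt(k / I) is real and nonzero.
    """
    p = math.sqrt(k / inertia)
    c, s = math.cos(p * tau), math.sin(p * tau)
    phi = [[c, s / p], [-p * s, c]]
    delta = [(1.0 - c) / (p * p * inertia), s / (p * inertia)]
    return phi, delta


# With a very weak gravity gradient the result approaches [[1, tau], [0, 1]]
# and [tau^2 / 2I, tau / I]
print(gravity_gradient_matrices(k=1e-8, inertia=2.0, tau=1.0))
```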




SUDAER-11i8 


III. RECURRENCE RELATIONS

A. SOLUTION WITH UNBOUNDED CONTROL

Before considering the case where the control $u(n)$ is bounded, the
solution to the unbounded control problem will be derived in detail.
Here there is no simplification in having $u(n)$ a scalar. The system
is

$$z(n+1) = \Phi z(n) + \Delta u(n) \tag{3.1}$$

$$y(n) = M z(n). \tag{3.2}$$

Given an initial condition $z(0)$ the control sequence $u(0),\ u(1),\ \ldots,\ u(N-1)$
is to be found that minimizes the performance index

$$J_N[z(0)] = \sum_{n=1}^{N} \left[ z^T(n)\,Q\,z(n) + u^T(n-1)\,\Gamma\,u(n-1) \right]. \tag{3.3}$$

The solution begins by defining $I_N[z(0)]$, called the optimal return
function, as the minimum value of $J_N[z(0)]$. This $I_N[z(0)]$ has a known
and simple form:

$$I_N[z(0)] = z^T(0)\,P_N\,z(0), \tag{3.4}$$

where $P_N$ is a symmetric, positive semidefinite matrix. That this form
is correct will be proved by induction later.

By definition

$$I_{N+1}[z(0)] = \min_{u(0)}\ \min_{u(1)} \cdots \min_{u(N)}\ \sum_{n=1}^{N+1} \left[ z^T(n)\,Q\,z(n) + u^T(n-1)\,\Gamma\,u(n-1) \right]. \tag{3.5}$$

Since $z(1)$ is determined solely by the choice of $u(0)$ and not by the
other $u(n)$, Eq. (3.5) can be factored as

$$I_{N+1}[z(0)] = \min_{u(0)} \Big\{ z^T(1)\,Q\,z(1) + u^T(0)\,\Gamma\,u(0)
+ \min_{u(1)} \cdots \min_{u(N)}\ \sum_{n=2}^{N+1} \left[ z^T(n)\,Q\,z(n) + u^T(n-1)\,\Gamma\,u(n-1) \right] \Big\}. \tag{3.6}$$

Noting that the last group of terms is exactly $I_N[z(1)]$ gives

$$I_{N+1}[z(0)] = \min_{u(0)} \left\{ z^T(1)\,Q\,z(1) + u^T(0)\,\Gamma\,u(0) + I_N[z(1)] \right\}. \tag{3.7}$$

The above equation could have been arrived at directly using the
Principle of Optimality [Ref. 9]. This principle states that the minimum
cost of an $N+1$ stage process is the minimum of the sum of the cost of
the first stage and the minimum cost of the remaining $N$ stages. (Note
that the arguments of the state and control variables increase with time,
while the subscript on $I_N$ decreases with time.)

Substituting Eqs. (3.1) and (3.4) into (3.7) gives

$$I_{N+1}[z(0)] = \min_{u(0)} \left\{ [\Phi z(0) + \Delta u(0)]^T (Q + P_N) [\Phi z(0) + \Delta u(0)] + u^T(0)\,\Gamma\,u(0) \right\}. \tag{3.8}$$

Completing the square on the right side of (3.8) and defining

$$u'(0) = A_{N+1}\,z(0) \tag{3.9}$$

$$A_{N+1} = -[\Delta^T (Q + P_N) \Delta + \Gamma]^{-1} \Delta^T (Q + P_N) \Phi \tag{3.10}$$

transforms Eq. (3.8) into Eq. (3.11):

$$\begin{aligned}
I_{N+1}[z(0)] = \min_{u(0)} \big\{ & [u(0) - u'(0)]^T [\Delta^T (Q + P_N) \Delta + \Gamma] [u(0) - u'(0)] \\
& + z^T(0)\,\Phi^T (Q + P_N) \Phi\,z(0) \\
& - z^T(0)\,A_{N+1}^T [\Delta^T (Q + P_N) \Delta + \Gamma] A_{N+1}\,z(0) \big\}.
\end{aligned} \tag{3.11}$$
The control $u(0)$ occurs in only the first term of (3.11). If the
matrix $[\Delta^T (Q + P_N) \Delta + \Gamma]$ is positive definite the
optimal control is unique. Then the minimum value, zero, of this first
term occurs only at

$$u(0) = u'(0). \tag{3.12}$$

The matrix will be positive definite if $\Gamma$ is positive definite or if
$Q$ is positive definite and the columns of $\Delta$ are linearly independent.
It will not be positive definite if $\Gamma = 0$ and the columns of $\Delta$ are
linearly dependent [Ref. 12]. In other cases this matrix might be singular,
although no such difficulty was encountered in the examples of
Chapter VI.

Equation (3.12), along with Eqs. (3.9) and (3.10), defines the optimal
value of $u(0)$.

The recurrence relation for $P_N$ is determined by equating Eq. (3.11)
with (3.4) when $u(0) = u'(0)$.

$$\begin{aligned}
I_{N+1}[z(0)] = z^T(0)\,P_{N+1}\,z(0) = {} & z^T(0)\,\Phi^T (Q + P_N) \Phi\,z(0) \\
& - z^T(0)\,A_{N+1}^T [\Delta^T (Q + P_N) \Delta + \Gamma] A_{N+1}\,z(0).
\end{aligned} \tag{3.13}$$

Since (3.13) must hold for all $z(0)$ the recurrence relation becomes

$$P_{N+1} = \Phi^T (Q + P_N)(\Phi + \Delta A_{N+1}), \tag{3.14}$$

where the relation

$$-A_{N+1}^T [\Delta^T (Q + P_N) \Delta + \Gamma] = \Phi^T (Q + P_N) \Delta \tag{3.15}$$

has been used to simplify (3.14).

Equation (3.13) shows that if the quadratic form for $I_N$ is correct,
then $I_{N+1}$ has the same form. The quadratic form is trivially correct
for $I_0$ since

$$I_0[z(0)] = 0 \tag{3.16}$$
for all $z(0)$. To complete the mathematical induction the form for $I_1$
must be shown to be correct. $I_1$ is determined from Eqs. (3.13) and
(3.10) noting that $P_0 = 0$. Equation (3.15) is again used to simplify
the result.

$$I_1[z(0)] = z^T(0)\,\Phi^T Q \left[ \Phi - \Delta (\Delta^T Q \Delta + \Gamma)^{-1} \Delta^T Q \Phi \right] z(0). \tag{3.17}$$

Thus $I_1[z(0)]$ has the required quadratic form.

The solution proceeds as follows: Since $I_0[z(0)] = 0$, $P_0 = 0$.
Beginning with $P_0 = 0$ calculate $A_1$. From $A_1$ and $P_0$ calculate $P_1$.
This iteration process is continued until all the $A_N$ of interest are
calculated. If the plant is controllable the $A_N$ will tend to a limit
as $N$ increases [Refs. 11, 12]. Therefore, for the infinite stage
regulator problem the optimal control in the unbounded case takes the
form of a stationary, linear function of the states.
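The iteration just described can be sketched in Python for a single-input system. The code starts from a zero matrix for the quadratic return function, computes the feedback gain from the current matrix, and then updates the matrix with the closed-loop transition $\Phi + \Delta A$. The function name, the list-based layout, and the scalar test plant are assumptions made for illustration.

```python
def unbounded_gains(phi, delta, q, gamma, stages):
    """Iterate the gain and return-function matrix from P_0 = 0.

    phi and q are m x m lists of rows, delta is a length-m list, gamma is a
    scalar. Returns the gain row A_N (u'(0) = A_N z(0)) and the matrix P_N.
    """
    m = len(phi)
    p = [[0.0] * m for _ in range(m)]
    gain = [0.0] * m
    for _ in range(stages):
        s = [[q[i][j] + p[i][j] for j in range(m)] for i in range(m)]
        # (Q + P_N) Delta; equals the transpose of Delta^T (Q + P_N)
        # because Q + P_N is symmetric
        s_delta = [sum(s[i][j] * delta[j] for j in range(m)) for i in range(m)]
        # Scalar Delta^T (Q + P_N) Delta + Gamma, assumed positive
        h = sum(d * sd for d, sd in zip(delta, s_delta)) + gamma
        # Gain: A_{N+1} = -h^{-1} Delta^T (Q + P_N) Phi
        gain = [-sum(s_delta[i] * phi[i][j] for i in range(m)) / h
                for j in range(m)]
        # Closed-loop transition Phi + Delta A_{N+1}
        closed = [[phi[i][j] + delta[i] * gain[j] for j in range(m)]
                  for i in range(m)]
        s_closed = [[sum(s[i][k] * closed[k][j] for k in range(m))
                     for j in range(m)] for i in range(m)]
        # P_{N+1} = Phi^T (Q + P_N) (Phi + Delta A_{N+1})
        p = [[sum(phi[k][i] * s_closed[k][j] for k in range(m))
              for j in range(m)] for i in range(m)]
    return gain, p


# Hypothetical scalar plant z(n+1) = z(n) + u(n) with Q = 1 and Gamma = 1
print(unbounded_gains([[1.0]], [1.0], [[1.0]], 1.0, 50))
```

For this scalar plant the limit satisfies $P^2 + P - 1 = 0$, so the gain tends to $-(\sqrt{5} - 1)/2$, which illustrates the stationary feedback law of the infinite stage problem.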

B. RECURRENCE RELATIONS WITH BOUNDED CONTROL

In the first part of this section the control $u(n)$ will be a vector
of any dimension. This will make it possible to use Eqs. (3.18) through
(3.25) in Chapter V, where two-dimensional control is considered in
detail. When the actual minimization over $u(0)$ is done in this section,
$u(0)$ will be considered a scalar.

Limiting the possible range of the control $u(n)$ to

$$\alpha^- \le u(n) \le \alpha^+ \tag{3.18}$$

complicates the solution greatly. The derivation in this section is the
same as that in Sec. A up to Eq. (3.7). Equation (3.7) becomes

$$I_{N+1}[z(0)] = \min_{\alpha^- \le u(0) \le \alpha^+} \left\{ z^T(1)\,Q\,z(1) + u^T(0)\,\Gamma\,u(0) + I_N[z(1)] \right\}. \tag{3.19}$$

$I_N[z(0)]$ takes the form, as will later be proved by induction,

$$I_N[z(0)] = z^T(0)\,P_N\,z(0) + z^T(0)\,R_N + R_N^T\,z(0) + C_N \tag{3.20}$$
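where

$P_N$ is an $(m \times m)$ positive semidefinite symmetric matrix,
$R_N$ is an $(m \times 1)$ vector,
$C_N$ is a scalar.

Substituting (3.1) and (3.20) into (3.19) gives

$$\begin{aligned}
I_{N+1}[z(0)] = \min_{\alpha^- \le u(0) \le \alpha^+} \big\{ & [\Phi z(0) + \Delta u(0)]^T (Q + P_N) [\Phi z(0) + \Delta u(0)] \\
& + u^T(0)\,\Gamma\,u(0) + [\Phi z(0) + \Delta u(0)]^T R_N + R_N^T [\Phi z(0) + \Delta u(0)] + C_N \big\}.
\end{aligned} \tag{3.21}$$

Again completing the square on $u(0)$ gives

$$\begin{aligned}
I_{N+1}[z(0)] = \min_{\alpha^- \le u(0) \le \alpha^+} \big\{ & [u(0) - u'(0)]^T [\Delta^T (Q + P_N) \Delta + \Gamma] [u(0) - u'(0)] \\
& + z^T(0)\,\Phi^T (Q + P_N) \Phi\,z(0) + z^T(0)\,\Phi^T R_N + R_N^T \Phi\,z(0) + C_N \\
& - u'^T(0) [\Delta^T (Q + P_N) \Delta + \Gamma]\,u'(0) \big\},
\end{aligned} \tag{3.22}$$

where

$$A_{N+1} = -[\Delta^T (Q + P_N) \Delta + \Gamma]^{-1} \Delta^T (Q + P_N) \Phi \tag{3.23}$$

as before, and

$$B_{N+1} = -[\Delta^T (Q + P_N) \Delta + \Gamma]^{-1} \Delta^T R_N \tag{3.24}$$

$$u'(0) = A_{N+1}\,z(0) + B_{N+1}. \tag{3.25}$$

The next step is to choose the $u(0)$ that minimizes Eq. (3.22).
This is easy when $u(0)$ is a scalar or when the distribution matrix $\Delta$
is an $(m \times m)$ diagonal matrix, an unlikely possibility. For the rest of
this chapter and in Chapter IV, $u(0)$ will be considered a scalar, that
is, there is only one controlling input to the system.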
Since $[\Delta^T (Q + P_N) \Delta + \Gamma]$ is supposedly nonsingular (it is
in fact a positive scalar), the minimum of Eq. (3.22) occurs at

$$u(0) = \begin{cases}
\alpha^+ & \text{if } u'(0) \ge \alpha^+ \\
u'(0) & \text{if } \alpha^- < u'(0) < \alpha^+ \\
\alpha^- & \text{if } u'(0) \le \alpha^-
\end{cases} \tag{3.26}$$

The final step is to derive recurrence relations for $P_{N+1}$, $R_{N+1}$,
and $C_{N+1}$. The existence of these relations gives the necessary proof
that the form assumed for $I_N[z(0)]$ is correct. Proof that the form for
$I_1[z(0)]$ is correct is the same as in Sec. A and will not be repeated.
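A minimal Python sketch of the saturated control law for one input follows. It forms the unconstrained minimizer $u'(0) = A z(0) + B$ from the current return-function coefficients and then clips it to the allowed interval. The function name, argument layout, and test values are hypothetical; the scalar $\Delta^T(Q+P)\Delta + \Gamma$ is assumed positive, as the text requires.

```python
def optimal_control(phi, delta, q, gamma, p, r, z, alpha_minus, alpha_plus):
    """Return (u_prime, u) for a single-input system.

    p and r are the quadratic and linear coefficients of the return
    function for the remaining stages; z is the current state.
    """
    m = len(phi)
    s = [[q[i][j] + p[i][j] for j in range(m)] for i in range(m)]
    s_delta = [sum(s[i][j] * delta[j] for j in range(m)) for i in range(m)]
    h = sum(d * sd for d, sd in zip(delta, s_delta)) + gamma  # positive scalar
    # Feedback gain A and offset B of the unconstrained minimizer
    gain = [-sum(s_delta[i] * phi[i][j] for i in range(m)) / h
            for j in range(m)]
    offset = -sum(d * ri for d, ri in zip(delta, r)) / h
    u_prime = sum(a * zi for a, zi in zip(gain, z)) + offset
    # Clip to the bounds: saturated whenever u_prime leaves the interval
    u = min(alpha_plus, max(alpha_minus, u_prime))
    return u_prime, u


# Hypothetical scalar plant with bounds of +/- 0.5
print(optimal_control([[1.0]], [1.0], [[1.0]], 1.0, [[0.5]], [0.0], [1.0], -0.5, 0.5))
```

Clipping is sufficient here only because the cost is a convex quadratic in a scalar control; with several inputs the constrained minimum generally needs a separate search.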

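The recurrence relations are different depending on whether or not
$u(0)$ is saturated. When $u(0)$ is unsaturated, that is, when
$\alpha^- < u'(0) < \alpha^+$, the relations can be obtained by equating (3.20)
with (3.22) along with $u(0) = u'(0)$.

$$\begin{aligned}
& z^T(0)\,P_{N+1}\,z(0) + z^T(0)\,R_{N+1} + R_{N+1}^T\,z(0) + C_{N+1} \\
& \quad = z^T(0)\,\Phi^T (Q + P_N) \Phi\,z(0) + z^T(0)\,\Phi^T R_N + R_N^T \Phi\,z(0) + C_N \\
& \qquad - [A_{N+1} z(0) + B_{N+1}]^T [\Delta^T (Q + P_N) \Delta + \Gamma] [A_{N+1} z(0) + B_{N+1}].
\end{aligned} \tag{3.27}$$

Thus the recurrence relations when $u(0)$ is unsaturated are

$$P_{N+1} = \Phi^T (Q + P_N)(\Phi + \Delta A_{N+1}) \tag{3.28}$$

$$R_{N+1} = (\Phi + \Delta A_{N+1})^T R_N \tag{3.29}$$

$$C_{N+1} = C_N + R_N^T \Delta B_{N+1} \tag{3.30}$$

where the simplifying relations

$$-A_{N+1}^T [\Delta^T (Q + P_N) \Delta + \Gamma] = \Phi^T (Q + P_N) \Delta \tag{3.31}$$

and

$$-B_{N+1}^T [\Delta^T (Q + P_N) \Delta + \Gamma] = R_N^T \Delta \tag{3.32}$$

have been used.

When $u(n)$ is unbounded, these recurrence relations reduce, as they
must, to that given in the unbounded case, Eq. (3.14). The equations for
$A_{N+1}$ and for $P_{N+1}$ are the same as in the unbounded case. Since
$R_0 = 0$, all $R_N = 0$. Since all $R_N = 0$, all $B_N = 0$. Finally,
since $C_0 = 0$, all $C_N = 0$.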

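The recurrence relations when $u(0)$ is saturated can be determined
by substituting $u(0) = \alpha$, where $\alpha$ represents either $\alpha^+$
or $\alpha^-$, into either Eq. (3.21) or (3.22) and equating the result
with (3.20). Equating (3.21) with (3.20) gives

$$\begin{aligned}
& [\Phi z(0) + \Delta\alpha]^T (Q + P_N) [\Phi z(0) + \Delta\alpha] + \alpha\,\Gamma\,\alpha \\
& \quad + [\Phi z(0) + \Delta\alpha]^T R_N + R_N^T [\Phi z(0) + \Delta\alpha] + C_N \\
& \quad = z^T(0)\,P_{N+1}\,z(0) + z^T(0)\,R_{N+1} + R_{N+1}^T\,z(0) + C_{N+1}.
\end{aligned} \tag{3.33}$$

The recurrence relations when control is saturated are thus

$$P_{N+1} = \Phi^T (Q + P_N) \Phi \tag{3.34}$$

$$R_{N+1} = \Phi^T [R_N + (Q + P_N) \Delta\alpha] \tag{3.35}$$

$$C_{N+1} = C_N + \alpha^2 [\Delta^T (Q + P_N) \Delta + \Gamma] + 2\alpha\,\Delta^T R_N \tag{3.36}$$

By the same arguments used in the unsaturated control case, the form
of $I_N[z(0)]$ is shown to be correct by mathematical induction.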

The  principal  equations  derived  in  this  section  are  summarized  at 
the  end  of  this  chapter. 
C. DISCUSSION

Beginning with a zero-stage process and calculating backward in time,
as long as all the stages have optimally unsaturated control, the $R_N$,
$C_N$, and $B_N$ remain zero. The first stage backward in time that is
saturated causes $R_N$ and $C_N$ to be nonzero, and they will remain
nonzero for the rest of the stages.

If it were known a priori which terms of the optimal control sequence
$u(0),\ u(1),\ \ldots,\ u(N-1)$ were saturated and which were not, the
solution would proceed simply as in the unbounded case. Beginning with a
one-stage process, $A_1$ and $B_1$ could be calculated. This requires no
knowledge of whether or not the optimal control is saturated. Next,
knowing whether the optimal control $u(0)$ equals $\alpha^+$, $\alpha^-$, or
is unsaturated, $P_1$, $R_1$, and $C_1$ could be calculated. This
computational scheme could be continued for as many stages as desired.

Unfortunately, nothing is known about the control sequence
beforehand; thus the above computational scheme cannot be used. At each
stage it is not known whether to use the recurrence relations for
unsaturated or for saturated control. A computational method that does
not require this a priori information is needed. Such a method will be
discussed in Chapter IV.

The method described in the second paragraph of this section is still
useful, however, and it has the advantage that it is exact. It can be
used to perfect estimates of the optimal control obtained by other methods.
For example, suppose the optimal control sequence was determined by a
method requiring a discrete state space such as dynamic programming or
the method described in the next chapter. Errors due to quantizing the
state space will build up, and thus the true minimum and the true optimal
unsaturated control will only be approximated. Now, however, it is known
whether the control at each stage is saturated or not, and the simple
computational scheme above can be applied to obtain the exact optimal
control. Boundaries of all control regions of the examples in Chapter
VI were checked in this manner.
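The exact scheme for a known saturation pattern can be sketched in Python as follows, for one input. Each backward stage applies either the unsaturated update, $P \leftarrow \Phi^T(Q+P)(\Phi+\Delta A)$, $R \leftarrow (\Phi+\Delta A)^T R$, $C \leftarrow C + R^T\Delta B$, or the saturated update at a given value $\alpha$, $P \leftarrow \Phi^T(Q+P)\Phi$, $R \leftarrow \Phi^T[R + (Q+P)\Delta\alpha]$, $C \leftarrow C + \alpha^2[\Delta^T(Q+P)\Delta+\Gamma] + 2\alpha\Delta^T R$. All function names and the test plant are assumptions; the resulting coefficients describe the return function only for initial states whose optimal sequence actually follows the supplied pattern.

```python
def backward_stage(phi, delta, q, gamma, p, r, c, alpha=None):
    """Advance (P, R, C) one stage backward for a single-input system.

    alpha is None for an unsaturated stage; otherwise it is the value at
    which the control of this stage is saturated.
    """
    m = len(phi)
    s = [[q[i][j] + p[i][j] for j in range(m)] for i in range(m)]
    s_delta = [sum(s[i][j] * delta[j] for j in range(m)) for i in range(m)]
    h = sum(d * sd for d, sd in zip(delta, s_delta)) + gamma
    delta_r = sum(d * ri for d, ri in zip(delta, r))  # Delta^T R
    if alpha is None:
        gain = [-sum(s_delta[i] * phi[i][j] for i in range(m)) / h
                for j in range(m)]
        offset = -delta_r / h
        # Closed-loop transition Phi + Delta A
        trans = [[phi[i][j] + delta[i] * gain[j] for j in range(m)]
                 for i in range(m)]
        r_next = [sum(trans[k][i] * r[k] for k in range(m)) for i in range(m)]
        c_next = c + delta_r * offset
    else:
        trans = phi
        r_next = [sum(phi[k][i] * (r[k] + s_delta[k] * alpha) for k in range(m))
                  for i in range(m)]
        c_next = c + alpha * alpha * h + 2.0 * alpha * delta_r
    # P_next = Phi^T (Q + P) trans, with trans = Phi when saturated
    s_trans = [[sum(s[i][k] * trans[k][j] for k in range(m)) for j in range(m)]
               for i in range(m)]
    p_next = [[sum(phi[k][i] * s_trans[k][j] for k in range(m))
               for j in range(m)] for i in range(m)]
    return p_next, r_next, c_next


def backward_pass(phi, delta, q, gamma, pattern):
    """Apply backward_stage for a pattern listed with the last control first."""
    m = len(phi)
    p, r, c = [[0.0] * m for _ in range(m)], [0.0] * m, 0.0
    for alpha in pattern:
        p, r, c = backward_stage(phi, delta, q, gamma, p, r, c, alpha)
    return p, r, c


def return_function(p, r, c, z):
    """Evaluate z^T P z + z^T R + R^T z + C."""
    n = len(z)
    quad = sum(z[i] * p[i][j] * z[j] for i in range(n) for j in range(n))
    return quad + 2.0 * sum(ri * zi for ri, zi in zip(r, z)) + c


# Hypothetical scalar plant: last control unsaturated, first saturated at -0.5
p, r, c = backward_pass([[1.0]], [1.0], [[1.0]], 1.0, [None, -0.5])
print(return_function(p, r, c, [1.0]))
```

For the usage line, simulating $u(0) = -0.5$ followed by $u(1) = -0.25$ from $z(0) = 1$ gives stage costs of 0.5 and 0.125, which agrees with the value returned by `return_function`.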
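SUMMARY OF PRINCIPAL EQUATIONS FOR SINGLE-INPUT CONTROL

Optimal return function

$$I_N[z(0)] = z^T(0)\,P_N\,z(0) + z^T(0)\,R_N + R_N^T\,z(0) + C_N \tag{3.20}$$

Optimal control

$$u(0) = \begin{cases}
\alpha^+ & \text{if } u'(0) \ge \alpha^+ \quad \text{(saturated)} \\
u'(0) & \text{if } \alpha^- < u'(0) < \alpha^+ \quad \text{(unsaturated)} \\
\alpha^- & \text{if } u'(0) \le \alpha^- \quad \text{(saturated)}
\end{cases} \tag{3.26}$$

Definitions

$$u'(0) = A_{N+1}\,z(0) + B_{N+1} \tag{3.25}$$

$$A_{N+1} = -[\Delta^T (Q + P_N) \Delta + \Gamma]^{-1} \Delta^T (Q + P_N) \Phi \tag{3.23}$$

$$B_{N+1} = -[\Delta^T (Q + P_N) \Delta + \Gamma]^{-1} \Delta^T R_N \tag{3.24}$$

Recurrence relations:

Unsaturated control

$$P_{N+1} = \Phi^T (Q + P_N)(\Phi + \Delta A_{N+1}) \tag{3.28}$$

$$R_{N+1} = (\Phi + \Delta A_{N+1})^T R_N \tag{3.29}$$

$$C_{N+1} = C_N + R_N^T \Delta B_{N+1} \tag{3.30}$$

Saturated control

$$P_{N+1} = \Phi^T (Q + P_N) \Phi \tag{3.34}$$

$$R_{N+1} = \Phi^T [R_N + (Q + P_N) \Delta\alpha] \tag{3.35}$$

$$C_{N+1} = C_N + \alpha^2 [\Delta^T (Q + P_N) \Delta + \Gamma] + 2\alpha\,\Delta^T R_N \tag{3.36}$$

Starting conditions

$$P_0 = 0, \qquad R_0 = 0, \qquad C_0 = 0.$$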
IV. COMPUTATIONAL ASPECTS

This chapter describes a method of using the equations derived in
Chapter III to determine the optimal control of any system described by
Eqs. (2.1) and (2.2). This method, which requires a digital computer,
calculates the optimal control from any point within a bounded region of
state space for the infinite stage regulator problem. Useful facts
pertaining to the actual computations are discussed.

Before describing the computing method recommended in this report,
the straight dynamic programming approach will be briefly discussed for
comparison.

A. DYNAMIC PROGRAMMING APPROACH

The basic dynamic programming approach to the problem is
straightforward but requires a very large and very fast digital computer
to solve for the optimal control of even small systems. This method
repeatedly uses the fundamental functional equation of dynamic programming
[Ref. 9] which, put into the form required for this problem, is

$$I_{N+1}[z(0)] = \min_{\alpha^- \le u(0) \le \alpha^+} \left\{ J_1[z(1)] + I_N[z(1)] \right\}, \tag{4.1}$$

where

$$J_1[z(1)] = z^T(1)\,Q\,z(1) + u^T(0)\,\Gamma\,u(0), \tag{4.2}$$

and $I_N[z(1)]$ is the minimum cost associated with initial condition
$z(1)$. Equation (4.1) is recognized as being the same as Eq. (3.19).

Although only the single-input case is being considered in this
chapter, functions of the control $u(n)$ will be written in vector-matrix
form for use later in this report and for future work. Of course in
the single-input case the last term of Eq. (4.2) is simply $u^2(0)\,\Gamma$.

In words, Eq. (4.1) states that the minimum cost $I_{N+1}[z(0)]$ from
initial state $z(0)$ is the minimum over the allowable values of the
control $u(0)$ of the sum of the cost of the first step, which takes
the state to $z(1)$, plus the minimum cost of being in state $z(1)$.
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Before  computing,  both  the  state  space  and  the  control  space  must 
be  quantized;  that  is,  a  discrete  set  of  values  is  chosen  over  which 
the  calculations  are  to  be  made.  This  set  must  be  dense  enou£^  to  pre¬ 
vent  errors  from  accumulating  during  the  calcvilatlons  as  the  result  of 
interpolation. 

The calculation is divided into two parts: First the $I_N[z(0)]$,
called the optimal return functions, are calculated backward in time for
all N and all z(0). Second, if actual optimal trajectories are
desired, these are calculated forward in time using the optimal control
calculated in the first part.

The first part of the calculation is the time-consuming part.
Beginning with $I_0[z(1)] = 0$ for all z(1), $I_1[z(0)]$ is calculated
from Eqs. (4.1) and (4.2) and the state-transition equation

$$z(1) = \Phi z(0) + \Delta u(0). \tag{4.3}$$

For a given value of z(0) and for each value of u(0), $J_1[z(1)]$ is
calculated and the minimum is stored as $I_1[z(0)]$. The optimal value of
u(0) is also stored. This calculation is performed for each z(0).

Now that the values of $I_1[z(1)]$ are known for all z(1), the
$I_2[z(0)]$ can be calculated, again using Eqs. (4.1) and (4.2) along
with the state-transition equation (4.3). Since the z(1) calculated
from z(0) by the state-transition equation will probably not be one
of the discrete values for which the $I_1[z(1)]$ were calculated, the
correct value of $I_1[z(1)]$ to use in Eq. (4.1) must be found by
interpolation. It is the interpolation that causes the most significant
errors to arise in the computation. Higher order than linear
interpolation can be used, but since the interpolation must be done a
very great number of times the computing time is increased significantly.

The process described in the last paragraph is continued until the
optimal return functions and the optimal control for the desired N
stages are calculated. In the case of the infinite stage regulator
problem, stages must be calculated until the optimal control for each
z(0) at stage N+1 is the same as the optimal control for each z(0)
at stage N. This may require very many stages of calculation.
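
As a rough illustration of the procedure just described, the following
Python sketch carries out the backward calculation of Eqs. (4.1) through
(4.3) for a scalar state and a scalar control. It is only a minimal sketch
under stated assumptions: the scalar plant (`phi`, `delta`), the weights
(`q`, `r`), the grid spacings, and the names `interp`, `dp_stage`, and
`solve_regulator` are illustrative and do not come from this report;
interpolation is linear, and the return function is simply held constant
outside the grid.

```python
from bisect import bisect_right

def interp(x, xs, ys):
    """Linear interpolation on an increasing grid; end values are held outside it."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

def dp_stage(I_prev, z_grid, u_grid, phi, delta, q, r):
    """One backward stage of Eq. (4.1): returns I_{N+1} and the optimal u(0) on the grid."""
    I_next, u_opt = [], []
    for z0 in z_grid:
        best_cost, best_u = None, None
        for u0 in u_grid:
            z1 = phi * z0 + delta * u0            # state transition, Eq. (4.3)
            cost = q * z1 * z1 + r * u0 * u0      # first-step cost, Eq. (4.2), scalar form
            cost += interp(z1, z_grid, I_prev)    # I_N[z(1)] found by interpolation
            if best_cost is None or cost < best_cost:
                best_cost, best_u = cost, u0
        I_next.append(best_cost)
        u_opt.append(best_u)
    return I_next, u_opt

def solve_regulator(z_grid, u_grid, phi, delta, q, r, max_stages=200):
    """Iterate stages until the control at stage N+1 equals the control at stage N."""
    I = [0.0] * len(z_grid)                       # I_0 = 0 for all states
    u_old = None
    for stage in range(1, max_stages + 1):
        I, u_opt = dp_stage(I, z_grid, u_grid, phi, delta, q, r)
        if u_opt == u_old:
            break
        u_old = u_opt
    return I, u_opt, stage

# Hypothetical example: z(1) = z(0) + u(0), |u| <= 1, cost z(1)^2 per stage.
z_grid = [-2.0 + 0.1 * i for i in range(41)]
u_grid = [-1.0 + 0.1 * j for j in range(21)]
I, u_opt, stages = solve_regulator(z_grid, u_grid, phi=1.0, delta=1.0, q=1.0, r=0.0)
print(u_opt[0], u_opt[20], u_opt[-1], stages)
```

Even in this one-dimensional example every stage requires one
interpolation for each state-control pair, which is the expense referred
to in the text.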
The fast memory requirements at each stage are three words for each
value of z(0): $I_N[z(0)]$, $I_{N+1}[z(0)]$, and the optimal u(0) for
stage N+1. Thus, for example, a two-dimensional problem with 100 values
of $z_1$ and 100 values of $z_2$ would require 30,000 words of fast
memory storage. This is approaching the limit of present-day computers. A
three-dimensional problem with 100 points to each dimension would require
3,000,000 words of storage, thus exceeding the limit of present computers,
a difficulty often referred to as the "curse of dimensionality."
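
The storage figures quoted above follow from a simple count, shown here
as a short Python sketch. The function name `dp_fast_memory_words` and its
parameter names are illustrative only; the factor of three words per grid
point is the one stated in the text.

```python
def dp_fast_memory_words(points_per_dimension, state_dimension, words_per_point=3):
    """Fast-memory words needed per stage by straight dynamic programming."""
    return words_per_point * points_per_dimension ** state_dimension

print(dp_fast_memory_words(100, 2))  # 30000
print(dp_fast_memory_words(100, 3))  # 3000000
```

Each added state dimension multiplies the requirement by the number of
quantization levels per dimension.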

The method discussed next for computing the special problem considered
in this report requires far less computing storage and computing time
than does straight dynamic programming.


B.  A COMPUTING METHOD

The basic dynamic programming algorithm makes no use of the recurrence
relations derived in Chapter III. By taking advantage of this additional
knowledge about the solution, considerable savings can be made in both
computer time and memory, making it possible to solve much larger
problems.

To facilitate the discussion of the computing method, which involves
calculating regions of optimal control, several definitions will be
made:


1.  Region of linear control. In the infinite stage regulator problem
there exists a region about the origin in state space where the
control for the first and all future stages is unsaturated. Such
a region will always exist if the plant is controllable, since in
the unbounded control case the control is a linear function of
the states and is zero at the origin. This region will be called
the region of linear control, or simply the linear region.

2.  Region of first saturation. If z(0) is not in the region of
linear control, at least one stage before the state-space
trajectory reaches the linear region will have saturated control. The
set of states at the first stage backward in time (or the last stage
forward in time) that is saturated will be called the region of first
saturation.

3.  Unsaturated region. Any region where the control u(0) is given
by the equation $u(0) = u'(0) = A_{N+1} z(0) + B_{N+1}$ will be called an
unsaturated region. The region of linear control is an unsaturated
region, but there will be others. Although the control in any
unsaturated region is linear, the term "linear region" will refer
only to the region of linear control.

4.  Saturated region. This is a region where the control is either
$u(0) = \alpha^+$ or $u(0) = \alpha^-$. Throughout, the term $\alpha$
will be used to denote either $\alpha^+$ or $\alpha^-$. Saturated regions
will also be referred to as alpha-plus regions or alpha-minus regions.

5.  z(1) region. This refers to a region that has already been
calculated, and from which new regions will be calculated.

6.  z(0) region. A z(0) region is one which is being presently
calculated from a z(1) region. Regions are calculated backward
in time as in dynamic programming; thus a z(0) region is
calculated from a z(1) region. (The actual trajectories are, of
course, from a z(0) region to the z(1) region from which it
was calculated.)

The method to be described in detail is basically as follows: First
the optimal feedback coefficients $A_{N+1}$ for the infinite stage
regulator problem with unbounded control are calculated. Once $A_{N+1}$
is known, the region of linear control can be computed. Using the same
$A_{N+1}$, the two regions of first saturation are calculated. From each
of these regions of first saturation are calculated an alpha-plus region,
an unsaturated region, and an alpha-minus region. Further regions are
calculated from each of these last regions, and the process is continued
until all the state space of interest is covered with regions.

In essence, assuming N stages are being calculated backward from
the linear region, this method considers all possible control sequences
u(0), u(1), ..., u(N-1), and determines the optimal sequence for each
point in state space. Since at each stage the control can take one of
three values ($u'(0)$, $\alpha^+$, or $\alpha^-$), it might be thought
that this method requires considering $3^N$ possible control sequences, a
staggering possibility. In practice, the number of control sequences
considered is far less. Most of the sequences will be found to be optimal
for no points in state space, and these sequences can be dropped from
further consideration as soon as they are discovered. The method described
here determines these nonoptimal sequences at the earliest possible time
during the computing.

As in dynamic programming, the state space must be quantized.
However, the control is determined by the formulas of Chapter III and is
not quantized.

A flow diagram of the computing is given in Fig. 3. First the
optimal control feedback coefficients for the linear region are
calculated along with the corresponding $P_N$ matrix, using the recurrence
relations for unsaturated control. Beginning with $P_0 = 0$, $A_1$ is
calculated. From $A_1$, $P_1$ is calculated. From $P_1$, $A_2$ is
calculated. This iterative procedure is continued until the $A_{N+1}$
converge to a limit. That these will converge is discussed by Kalman
[Ref. 11] and Gunckel [Ref. 12]. The unsaturated control recurrence
relations show $B_{N+1}$, $R_N$, and $C_N$ to be zero for all N, since
$R_0 = 0$ and $C_0 = 0$.
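
The iteration for the linear-region feedback coefficients can be sketched
in Python as follows. This is a hedged illustration, not the program used
for this report: the recurrences coded here are the ones obtained by
minimizing the one-step cost of Eq. (4.2) plus $z^T(1) P_N z(1)$ with no
bound on a single input, namely
$A_{N+1} = -\Delta^T(Q+P_N)\Phi / [\Delta^T(Q+P_N)\Delta + r]$ and
$P_{N+1} = (\Phi+\Delta A_{N+1})^T(Q+P_N)(\Phi+\Delta A_{N+1}) + A_{N+1}^T r A_{N+1}$;
the helper names, the tolerance, and the scalar example are assumptions.

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def mat_add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def unsaturated_step(P, Phi, Delta, Q, r):
    """One stage for a single input. Delta is an m x 1 column, r a scalar.

    Assumes Delta^T (Q + P) Delta + r is not zero. Returns (A_next, P_next)."""
    M = mat_add(Q, P)                                   # Q + P_N
    DtM = mat_mul(transpose(Delta), M)                  # Delta^T (Q + P_N), 1 x m
    denom = mat_mul(DtM, Delta)[0][0] + r               # scalar for one input
    A = [[-x / denom for x in mat_mul(DtM, Phi)[0]]]    # feedback row, 1 x m
    closed = mat_add(Phi, mat_mul(Delta, A))            # Phi + Delta A
    P_next = mat_add(mat_mul(mat_mul(transpose(closed), M), closed),
                     [[r * ai * aj for aj in A[0]] for ai in A[0]])
    return A, P_next

def linear_region_gain(Phi, Delta, Q, r, tol=1e-12, max_iter=10000):
    """Iterate from P_0 = 0 until successive feedback rows agree within tol."""
    m = len(Phi)
    P = [[0.0] * m for _ in range(m)]
    A_old = None
    for _ in range(max_iter):
        A, P = unsaturated_step(P, Phi, Delta, Q, r)
        if A_old is not None and max(abs(a - b) for a, b in zip(A[0], A_old[0])) < tol:
            break
        A_old = A
    return A, P

# Hypothetical scalar plant z(1) = z(0) + u(0) with Q = 1 and r = 1.
A, P = linear_region_gain([[1.0]], [[1.0]], [[1.0]], 1.0)
print(A, P)
```

For this scalar example the limit satisfies $P^2 + P - 1 = 0$, so both
$P$ and $-A$ approach $(\sqrt{5} - 1)/2$.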


Second, the region of linear control is calculated. The optimal
control formulas for this region are

$$u(n) = u'(n) = A_{N+1} z(n), \tag{4.4}$$

$$\alpha^- \le u'(n) \le \alpha^+ \tag{4.5}$$

for all n, where the $A_{N+1}$ is that calculated in the first step. Two
bounds on this region can be found immediately by setting u(n) in
Eq. (4.4) equal to $\alpha^+$ and $\alpha^-$. Thus two bounds are

$$\alpha^+ = A_{N+1} z(0), \qquad \alpha^- = A_{N+1} z(0). \tag{4.6}$$

For each z(0) on and within the boundaries (4.6) calculate

$$z(1) = (\Phi + \Delta A_{N+1}) z(0). \tag{4.7}$$

Only those z(0) which determine z(1) that are on and within the
boundaries (4.6) can be in the region of linear control. From each
z(0) within the boundaries (4.6), enough points forward in time must
be calculated to ensure that the z(0) is actually in the linear region.
In the two-dimensional examples of Chapter VI, where ^ ^3t~, only
z(0) and z(1) both needed to be within the boundaries (4.6). In
general more stages must be calculated.
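
A minimal Python sketch of this membership test is given below. It checks
inequality (4.5) at z(0) and at a chosen number of successor states
generated by Eq. (4.7). The function name `in_linear_region`, the default
of two stages, and the example plant and feedback row are assumptions made
for illustration; in particular the feedback row used here is not claimed
to be optimal for any cost.

```python
def in_linear_region(z0, A, Phi, Delta, alpha_minus, alpha_plus, stages=2):
    """True if u' = A z stays within the bounds at z(0) and the next stages - 1 states.

    A is the feedback row (m numbers), Delta the single-input column (m numbers)."""
    m = len(z0)
    z = list(z0)
    for _ in range(stages):
        u = sum(a * x for a, x in zip(A, z))            # Eq. (4.4)
        if not (alpha_minus <= u <= alpha_plus):        # inequality (4.5)
            return False
        # Eq. (4.7): z(n+1) = (Phi + Delta A) z(n)
        z = [sum(Phi[i][k] * z[k] for k in range(m)) + Delta[i] * u for i in range(m)]
    return True

# Hypothetical second-order plant and feedback row.
Phi = [[1.0, 1.0], [0.0, 1.0]]
Delta = [0.0, 1.0]
A = [-1.0, -1.0]
print(in_linear_region([0.5, -0.5], A, Phi, Delta, -1.0, 1.0))            # True
print(in_linear_region([1.5, -1.5], A, Phi, Delta, -1.0, 1.0, stages=1))  # True
print(in_linear_region([1.5, -1.5], A, Phi, Delta, -1.0, 1.0, stages=2))  # False
```

The last two calls show a state that lies within the boundaries (4.6) but
whose successor does not, so it is rejected once z(1) is examined.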
FIG. 3. FLOW DIAGRAM OF COMPUTING METHOD.

The third step is to calculate the two regions of first saturation.
These are the z(0) regions that go optimally into the linear region
with $u = \alpha$. Thus the alpha-plus region is defined by inequality
(4.8) and Eq. (4.9):

$$u'(0) = A_{N+1} z(0) > \alpha^+, \tag{4.8}$$

$$z(0) = \Phi^{-1}[z(1) - \Delta\alpha^+], \tag{4.9}$$

where z(1) is in the linear region. The $A_{N+1}$ is the same as that
used in calculating the linear region, since it is derived from the same
$P_N$. The alpha-minus region is defined in a similar manner. If
$\alpha^+ = -\alpha^-$, the alpha-minus region (and all regions derived
from it) need not be calculated, since it is symmetric with respect to the
origin to the alpha-plus region (and those derived from it). Finally, the
$P_N$, $R_N$, and $C_N$ are calculated for the regions of first
saturation, using the saturated control recurrence relations.

The above steps are essentially initializing; the principal
calculations now begin. There are now two z(1) regions from which to
calculate, namely the two regions of first saturation. Consider the
calculations from one of these. First the $A_{N+1}$ and $B_{N+1}$ are
calculated using the $P_N$ and $R_N$ from the z(1) region. The optimal
control for three z(0) regions (an alpha-plus region, an unsaturated
region, and an alpha-minus region) is determined from these $A_{N+1}$ and
$B_{N+1}$. Each of these three regions must satisfy two relations as
follows:

1.  Alpha-plus region.

$$z(0) = \Phi^{-1}[z(1) - \Delta\alpha^+] \tag{4.10}$$

$$u'(0) = A_{N+1} z(0) + B_{N+1} > \alpha^+ \tag{4.11}$$

2.  Unsaturated region.

$$z(0) = (\Phi + \Delta A_{N+1})^{-1}[z(1) - \Delta B_{N+1}] \tag{4.12}$$

$$\alpha^- \le u'(0) = A_{N+1} z(0) + B_{N+1} \le \alpha^+ \tag{4.13}$$

3.  Alpha-minus region.

$$z(0) = \Phi^{-1}[z(1) - \Delta\alpha^-] \tag{4.14}$$

$$u'(0) = A_{N+1} z(0) + B_{N+1} < \alpha^- \tag{4.15}$$

In each of these equations z(1) is in the z(1) region, and for a
z(0) to be in the new z(0) region, both the equation and the
inequality for that region must be satisfied. Most of the regions
calculated will be found to contain no states z(0). It is for this reason
that there are considerably fewer than $3^N$ regions to consider.

Equations (4.10), (4.12), and (4.14) are written as though z(0)
will always be calculated from z(1) through an inverse relation. It
is of course equally possible to calculate z(1) from z(0) by

$$z(1) = \Phi z(0) + \Delta u(0) \tag{4.16}$$

for all z(0) in the quantized state space and keep only those z(0)
for which the corresponding z(1) is in the desired z(1) region.

There are advantages and disadvantages for both methods of computing.
Calculating z(1) from z(0) is easier because no "holes" can develop
in the z(0) region. (Holes are points that belong within a region
but are not calculated as being in the region.) However, because a
very large percentage of the states z(0) will not be in the z(0)
region, considerable computing time is consumed by computing z(1)
from z(0).

Computing z(0) from z(1) consumes less computing time because
fewer points are considered. Only those z(0) calculated from the
z(1) in a particular z(1) region are considered. However, if the
z(0) region contains more points than the z(1) region, holes will
develop, and care must be taken to eliminate them. This is particularly
a problem when calculating unsaturated regions. Also the points z(0)
calculated from z(1) will be in a somewhat random order in the computer
memory and time must be taken to put them in some orderly and useful
sequence.
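
The forward variant, in which z(1) is calculated from each quantized z(0)
by Eq. (4.16), can be sketched in Python as follows for a single input.
This is an illustrative sketch only: the function name `classify_z0`, the
caller-supplied membership test `in_z1_region`, and the scalar example are
assumptions, and the coefficients `A` and `B` are taken as already
computed from the z(1) region.

```python
def classify_z0(z0, A, B, Phi, Delta, alpha_minus, alpha_plus, in_z1_region):
    """Return (region type, u(0)) if z0 maps into the given z(1) region, else None.

    A is the feedback row (m numbers), B a scalar, Delta the input column."""
    m = len(z0)
    u_prime = sum(a * x for a, x in zip(A, z0)) + B     # u'(0) = A z(0) + B
    if u_prime > alpha_plus:
        kind, u0 = "alpha-plus", alpha_plus             # inequality (4.11)
    elif u_prime < alpha_minus:
        kind, u0 = "alpha-minus", alpha_minus           # inequality (4.15)
    else:
        kind, u0 = "unsaturated", u_prime               # inequality (4.13)
    # Forward transition, Eq. (4.16)
    z1 = [sum(Phi[i][k] * z0[k] for k in range(m)) + Delta[i] * u0 for i in range(m)]
    return (kind, u0) if in_z1_region(z1) else None

# Hypothetical scalar example: z(1) = z(0) + u(0), u' = -0.5 z, |u| <= 1.
# With this feedback the linear region is |z| <= 2.
Phi, Delta, A, B = [[1.0]], [1.0], [-0.5], 0.0

def linear_region(z):
    return abs(z[0]) <= 2.0

print(classify_z0([2.5], A, B, Phi, Delta, -1.0, 1.0, linear_region))  # ('alpha-minus', -1.0)
print(classify_z0([4.0], A, B, Phi, Delta, -1.0, 1.0, linear_region))  # None
```

Because every quantized z(0) is examined, no holes can arise, at the cost
of testing many states that fall in no region.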

Equations (4.10), (4.12), and (4.14) assume that the inverses of
certain matrices exist. If the state variables have been chosen so that
the minimum number necessary to completely characterize the system is
used, the matrix $\Phi$ will be nonsingular; thus its inverse will exist.
The other matrix assumed to be nonsingular is $(\Phi + \Delta A_{N+1})$.
This matrix is nonsingular if r is not zero. However, if r = 0 this matrix
will always be singular. The following proof will show an even stronger
result: If r = 0, $(\Phi + \Delta A_{N+1})$ has dimension (m x m), and
$\Delta$ has rank q (m is the dimension of z and q is ordinarily the
dimension of u), then $(\Phi + \Delta A_{N+1})$ has rank no greater than
m - q.

The proof is as follows. Consider a square matrix M of dimension m.
If a nontrivial vector c can be found such that $c^T M = 0$, then by
definition M is singular. If there exist q nontrivial linearly independent
vectors c such that there are q linearly independent vector equations
$c^T M = 0$, then q of the rows of M are linear combinations of the
other m - q rows. The null space of M therefore has dimension at least q,
and the rank of M is no greater than m - q.

The q nontrivial linearly independent vectors that show the matrix
$(\Phi + \Delta A_{N+1})$ has rank no greater than m - q are the columns
of $(Q + P_N)\Delta$. Thus

$$\Delta^T (Q + P_N)(\Phi + \Delta A_{N+1}) = \Delta^T (Q + P_N)\left(\Phi - \Delta[\Delta^T (Q + P_N)\Delta]^{-1}\Delta^T (Q + P_N)\Phi\right) = 0. \tag{4.17}$$

This singularity can therefore be predicted in advance and the computer
program written accordingly.
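
The singularity can be checked numerically. The Python sketch below forms
$(\Phi + \Delta A_{N+1})$ for a single input with r = 0 and with r = 1,
using the feedback row
$A_{N+1} = -\Delta^T(Q+P_N)\Phi / [\Delta^T(Q+P_N)\Delta + r]$ that appears
inside Eq. (4.17). The second-order plant, the choice $Q + P_N = I$, and
the helper names are assumptions made for the example.

```python
def closed_loop_matrix(Phi, Delta, M, r):
    """Return (Phi + Delta A, Delta^T M (Phi + Delta A)) for one input.

    M stands for Q + P_N and is assumed symmetric; Delta is a list of m numbers."""
    m = len(Phi)
    MD = [sum(M[i][k] * Delta[k] for k in range(m)) for i in range(m)]   # (Q + P_N) Delta
    denom = sum(Delta[i] * MD[i] for i in range(m)) + r
    A = [-sum(MD[i] * Phi[i][j] for i in range(m)) / denom for j in range(m)]
    closed = [[Phi[i][j] + Delta[i] * A[j] for j in range(m)] for i in range(m)]
    left = [sum(MD[i] * closed[i][j] for i in range(m)) for j in range(m)]
    return closed, left

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

Phi = [[1.0, 1.0], [0.0, 1.0]]
Delta = [0.5, 1.0]
M = [[1.0, 0.0], [0.0, 1.0]]

closed0, left0 = closed_loop_matrix(Phi, Delta, M, r=0.0)
closed1, left1 = closed_loop_matrix(Phi, Delta, M, r=1.0)
print(det2(closed0), left0)   # determinant and left product vanish (to rounding) when r = 0
print(det2(closed1))          # nonzero when r = 1
```

With r = 0 the product in Eq. (4.17) is zero and the determinant vanishes
to rounding error; with r = 1 the determinant is clearly nonzero.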

The next step is to calculate the $P_N$, $R_N$, and $C_N$ for the z(0)
regions just calculated that actually contain points. Regions that are
found to contain no points are ignored entirely. The new z(0) regions
are now stored in the fast memory as new z(1) regions. The old z(1)
region can now be discarded.

The output can include a description of the points in the region,
the type of region (alpha-plus, unsaturated, or alpha-minus), and the
associated coefficients. The optimal cost can also be calculated and
written.

This process of calculating z(0) regions is continued until the
entire state space of interest is covered with regions.

C.  DISCUSSION 

This new program runs much more quickly than straight dynamic
programming because the optimal control for each point is known from the
recurrence relations. The memory requirements are also much smaller since
only the z(0) regions and a single z(1) region need be in the fast memory
at one time. It is convenient, however, to store all unused z(1) regions
in the fast memory. Because of greater speed and smaller storage
requirements, this new program can handle problems of larger dimension
than can be run with straight dynamic programming. There is still,
however, a limit to the size of problem that can be run. A comparison of
memory requirements is given for a specific example in Chapter VI. The
restriction that the control be a scalar is removed in the next chapter.

The question of whether regions computed in this manner will overlap
is still open. Such an overlap did not occur in any of the examples of
Chapter VI. If, after computing, some regions do overlap, a comparison
of the optimal costs from these regions can be made using their
respective $P_N$, $R_N$, and $C_N$, enabling the true optimal control to
be chosen.

V.  TWO-INPUT CONTROL


So far, only the solution to the single-input case has been completed
in detail. This chapter extends these results to the case in which the
control u(n) is a two-dimensional vector. The solution is considerably
complicated by the fact that one of the inputs may be saturated while the
other is not. Although controls of dimension higher than two are not
treated here, this chapter indicates the extensions and changes that must
be made for them.

A.  THE  PROBLEM 

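The description of the system and performance criterion is the same
as given in Chapter II. The control u(n) and its bounds $\alpha^+$ and
$\alpha^-$ are now two-dimensional vectors:

$$u(n) = \begin{bmatrix} u_1(n) \\ u_2(n) \end{bmatrix}, \qquad \alpha^+ = \begin{bmatrix} \alpha_1^+ \\ \alpha_2^+ \end{bmatrix}, \qquad \alpha^- = \begin{bmatrix} \alpha_1^- \\ \alpha_2^- \end{bmatrix}. \tag{5.1}$$

Control is limited by the vector inequality

$$\alpha^- \le u(n) \le \alpha^+. \tag{5.2}$$

The problem is: Given any initial condition z(0), find the optimal
control-vector sequence u(0), u(1), u(2), ... that minimizes the
performance index $J_\infty[z(0)]$.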

B.  THE  SOLUTION 

The equations for $I_{N+1}[z(0)]$ derived in Chapter III up through Eq.
(3.25) were written in vector notation so that they could be used in
this chapter. Equation (3.21) is written here as the starting point of
the solution:

$$I_{N+1}[z(0)] = \min_{\alpha^- \le u(0) \le \alpha^+} \{ [\Phi z(0) + \Delta u(0)]^T (Q + P_N)[\Phi z(0) + \Delta u(0)] + u^T(0) r u(0) + [\Phi z(0) + \Delta u(0)]^T R_N + R_N^T [\Phi z(0) + \Delta u(0)] + C_N \}. \tag{5.3}$$

As in Chapter III the square is completed on the control vector u(0).
The result is Eq. (3.22):

$$I_{N+1}[z(0)] = \min_{\alpha^- \le u(0) \le \alpha^+} \{ [u(0) - u'(0)]^T [\Delta^T (Q + P_N)\Delta + r][u(0) - u'(0)] + z^T(0)\Phi^T (Q + P_N)\Phi z(0) + z^T(0)\Phi^T R_N + R_N^T \Phi z(0) + C_N - u'^T(0)[\Delta^T (Q + P_N)\Delta + r]u'(0) \}, \tag{5.4}$$

where $A_{N+1}$, $B_{N+1}$, and $u'(0)$ are given by Eqs. (3.23) through
(3.25).

The minimum without regard to bounds occurs at $u(0) = u'(0)$. If
the resulting u(0) satisfies the vector inequality (5.2), then
$u(0) = u'(0)$ is the optimal control. However, if one or both of the
elements of $u'(0)$ are out of bounds, the situation is much more
complicated.

Before presenting a careful algebraic discussion of a method for
finding the minimum of Eq. (5.4), a more intuitive geometrical discussion
will be given.

C.  GEOMETRICAL  DISCUSSION  OF  THE  MINIMUM 

Figures 4a through 4f show the two-dimensional control space. Each
point represents a particular control ($u_1$, $u_2$). The rectangle, whose
sides are given by $u_1 = \alpha_1^+$, $u_1 = \alpha_1^-$,
$u_2 = \alpha_2^+$, and $u_2 = \alpha_2^-$, bounds the region of allowable
control.

Geometrically, the contours of a positive definite quadratic function
in two variables are ellipses. In each figure are drawn concentric
ellipses, which are loci of constant $J_{N+1}[z(0)]$. The value of
$J_{N+1}[z(0)]$ decreases as the ellipse size decreases. The absolute
minimum occurs at the center of the ellipses, which has coordinates
($u_1'$, $u_2'$).

The geometrical problem then is to find the point in the control
space that is both on the smallest possible ellipse and in or on the
rectangle. Algebraically this is the same problem as expressed by the
now-familiar equation

$$I_{N+1}[z(0)] = \min_{\alpha^- \le u(0) \le \alpha^+} J_{N+1}[z(0)]. \tag{5.5}$$

If the center of the ellipses is in the rectangle, as shown in Fig. 4a,
then the minimum occurs at ($u_1'$, $u_2'$). If the center is outside the
rectangle, as shown in Figs. 4b through 4f, then it is clear that the
minimum occurs on the boundary, since any control within the rectangle
is on a larger ellipse than one either tangent to the boundary or
touching a corner.

Figures 4b and 4c show the case where one $u_i'$, in this case $u_2'$,
is greater than its bound, while the other $u_i'$ is within its bounds. In
both cases the optimal value of $u_2$ is $\alpha_2^+$, but the optimal
value of $u_1$ can be anything. To determine $u_1$, the optimal value
$u_2 = \alpha_2^+$ is substituted into Eq. (5.3) or (5.4), and by
completing the square on $u_1$ the value of $u_1$ that minimizes the
function is found. This value, called $u_1''$, may or may not be in
bounds. The optimal $u_1$ is then

$$u_1 = \begin{cases} \alpha_1^+ & \text{if } u_1'' > \alpha_1^+ \quad \text{(Fig. 4b)} \\ u_1'' & \text{if } \alpha_1^- \le u_1'' \le \alpha_1^+ \quad \text{(Fig. 4c)} \\ \alpha_1^- & \text{if } u_1'' < \alpha_1^- \end{cases} \tag{5.6}$$

Note that the value of $u_1'$ does not in any way indicate the optimal
value of $u_1$.

Figures 4d through 4f show cases where both $u_1'$ and $u_2'$ are
outside the bounds. In these cases all that can be said without further
calculation is that at least one of the $u_i = \alpha_i$ gives the optimal
control. Geometrically this means that the optimal control is on one of
the two boundaries nearest the center of the ellipse, a fact that will be
proved algebraically in the next section. Since it is not known which
$u_i = \alpha_i$ gives the correct result, both $u_1''$ and $u_2''$ must
be calculated. Assume $u_1'$ and $u_2'$ are greater than $\alpha_1^+$ and
$\alpha_2^+$ respectively as in the figures. Then $u_1''$ is calculated as
the optimal value of $u_1$ (neglecting saturation of $u_1$) with
$u_2 = \alpha_2^+$, and $u_2''$ is likewise calculated. Since only one of
the assumptions made in calculating the $u_i''$ was necessarily correct,
only one of the $u_i''$ is necessarily correct. However, as shown in the
next section, both $u_i''$ calculated determine correctly their respective
optimal $u_i$.

If the correct optimal value of $u_2$ is $\alpha_2^+$ as shown in Figs.
4d through 4f, then the value of $u_1''$ calculated is correct. The
optimal $u_1$ is thus

$$u_1 = \begin{cases} \alpha_1^+ & \text{if } u_1'' > \alpha_1^+ \quad \text{(Fig. 4d)} \\ u_1'' & \text{if } \alpha_1^- \le u_1'' \le \alpha_1^+ \quad \text{(Fig. 4e)} \\ \alpha_1^- & \text{if } u_1'' < \alpha_1^- \quad \text{(Fig. 4f)} \end{cases} \tag{5.7}$$

Note that even though the value of $u_1'$ suggests that the optimal value
of $u_1$ is $\alpha_1^+$, the real optimal value can be far different,
even $\alpha_1^-$.

The next section algebraically proves that the $u_i''$, where
calculated, give the optimal values of $u_i$ in all cases.


D.  ALGEBRAIC DETERMINATION OF THE MINIMUM

To simplify the notation in this section, consider only the part of
$J_{N+1}[z(0)]$ that is quadratic in u. This is

$$\xi = (u - u')^T K (u - u') \tag{5.8}$$

where

$$(u - u') = \begin{bmatrix} u_1 - u_1' \\ u_2 - u_2' \end{bmatrix} \tag{5.9}$$

and

$$K = [\Delta^T (Q + P_N)\Delta + r] = \begin{bmatrix} k_{11} & k_{12} \\ k_{12} & k_{22} \end{bmatrix}. \tag{5.10}$$

K is a positive definite symmetric matrix, and hence the smallest
value $\xi$ can have is zero, which occurs only at $u = u'$. If
$\alpha^- \le u' \le \alpha^+$, the optimum value of u is clearly
$u = u'$.

The quantities $u_i''$ have a somewhat more general meaning in this
section than in the last section. For example $u_1''$ is the optimum value
of $u_1$ (ignoring saturation) for any given value of $u_2$, not just
$\alpha_2$. It is derived by completing the square on $\xi$ with respect
to $u_1$. Thus

$$\xi = k_{11}(u_1 - u_1')^2 + 2k_{12}(u_1 - u_1')(u_2 - u_2') + k_{22}(u_2 - u_2')^2, \tag{5.11}$$

$$\xi = k_{11}(u_1 - u_1'')^2 + \text{terms not involving } u_1. \tag{5.12}$$

By equating (5.12) with (5.11) an equation for $u_1''$ is determined:

$$u_1'' = u_1' - \frac{(u_2 - u_2')k_{12}}{k_{11}}. \tag{5.13}$$

Likewise,

$$u_2'' = u_2' - \frac{(u_1 - u_1')k_{12}}{k_{22}}. \tag{5.14}$$


The optimal control is determined when four equations are
simultaneously satisfied. These are Eqs. (5.13) and (5.14) along with

$$u_1 = \begin{cases} \alpha_1^+ & \text{if } u_1'' > \alpha_1^+ \\ u_1'' & \text{if } \alpha_1^- \le u_1'' \le \alpha_1^+ \\ \alpha_1^- & \text{if } u_1'' < \alpha_1^- \end{cases} \tag{5.15}$$

$$u_2 = \begin{cases} \alpha_2^+ & \text{if } u_2'' > \alpha_2^+ \\ u_2'' & \text{if } \alpha_2^- \le u_2'' \le \alpha_2^+ \\ \alpha_2^- & \text{if } u_2'' < \alpha_2^- \end{cases} \tag{5.16}$$

These four equations can be solved on an analog computer very simply,
but this is not much help here. The following proofs show a simple way
to determine the optimal values of $u_1$ and $u_2$. In the following
discussion $u_i'$ will be called not admissible when either
$u_i' > \alpha_i^+$ or $u_i' < \alpha_i^-$. Otherwise it will be called
admissible.

Two cases need to be considered: (1) only one $u_i'$ is not admissible,
and (2) neither $u_i'$ is admissible. Before beginning, two inequality
relations involving the elements of a symmetric positive definite matrix
must be stated. These can be found in nearly any book on matrix theory
[Ref. 17].

$$k_{11} > 0 \quad (\text{and } k_{22} > 0), \tag{5.17}$$

$$k_{11}k_{22} - k_{12}^2 > 0. \tag{5.18}$$


1.  Case 1

The first case occurs when one $u_i'$ is admissible while the
other is not. For definiteness let

$$\alpha_1^- \le u_1' \le \alpha_1^+, \tag{5.19}$$

$$u_2' > \alpha_2^+. \tag{5.20}$$

It will now be shown that the optimal value of $u_2$ is $\alpha_2^+$
regardless of the optimal value of $u_1$. From (5.13)

$$u_1'' - u_1' = \frac{[-(u_2 - u_2')]k_{12}}{k_{11}}. \tag{5.21}$$

The quantity in brackets is positive regardless of the choice of $u_2$
because of inequality (5.20). Thus $(u_1'' - u_1')$ has the same sign as
$k_{12}$.

If $k_{12}$ is negative or zero, then $u_1'' \le u_1'$, and either
$u_1 = u_1''$ or $u_1 = \alpha_1^-$. Thus

$$u_1'' \le u_1 \le u_1'. \tag{5.22}$$

Therefore

$$0 \ge u_1 - u_1' \ge u_1'' - u_1'. \tag{5.23}$$

When $k_{12}$ is negative, multiplying inequality (5.23) through by
$k_{12}/k_{22}$ reverses the inequality signs. The result is of course
trivial when $k_{12}$ is zero.

$$0 \le \frac{(u_1 - u_1')k_{12}}{k_{22}} \le \frac{-(u_2 - u_2')k_{12}^2}{k_{11}k_{22}}. \tag{5.24}$$

The right side of (5.24) is determined by using Eq. (5.21).

If $k_{12}$ is positive or zero, the same steps outlined in the last
paragraph can be taken, and the result is again (5.24).

Equation (5.24) shows that the term $(u_1 - u_1')k_{12}/k_{22}$
appearing in Eq. (5.14) is positive or zero, and bounded above, regardless
of the sign of $k_{12}$. Using (5.24), Eq. (5.14) can be written as the
following inequality:

$$u_2'' \ge u_2' + \frac{(u_2 - u_2')k_{12}^2}{k_{11}k_{22}}, \tag{5.25}$$

$$u_2'' \ge u_2'\left(1 - \frac{k_{12}^2}{k_{11}k_{22}}\right) + \frac{u_2 k_{12}^2}{k_{11}k_{22}}. \tag{5.26}$$

Since $k_{12}^2/(k_{11}k_{22}) < 1$ as shown by (5.18), the term in
parentheses in (5.26) is positive. Substituting (5.20) into (5.26) makes
the inequality even stronger.

$$u_2'' > \alpha_2^+\left(1 - \frac{k_{12}^2}{k_{11}k_{22}}\right) + \frac{u_2 k_{12}^2}{k_{11}k_{22}}. \tag{5.27}$$

The only $u_2$ that satisfies both (5.27) and (5.16) is
$u_2 = \alpha_2^+$. Substituting $u_2 < \alpha_2^+$ into (5.27) gives
$u_2'' > u_2$, which contradicts (5.16).

The conclusion is that if $u_i'$ is not admissible and $u_j'$
($j \ne i$) is admissible, the optimal value of $u_i$ equals the nearest
bound to $u_i'$, while $u_j$ is determined from $u_j''$.

2.  Case 2

The second case occurs when neither $u_1'$ nor $u_2'$ is admissible.
For definiteness let

$$u_1' > \alpha_1^+, \tag{5.28}$$

$$u_2' > \alpha_2^+. \tag{5.29}$$

Equations (5.13) and (5.14) written in the form of (5.21) show that
both $(u_1'' - u_1')$ and $(u_2'' - u_2')$ have the same sign as $k_{12}$.

If $k_{12}$ is positive or zero then

$$u_1'' \ge u_1' > \alpha_1^+, \tag{5.30}$$

$$u_2'' \ge u_2' > \alpha_2^+. \tag{5.31}$$

Thus the optimal control is $u_1 = \alpha_1^+$, $u_2 = \alpha_2^+$.

If $k_{12}$ is negative the situation is much more complicated. The
values of $u_1'$ and $u_2'$ given by (5.28) and (5.29) determine that
either $u_1 = \alpha_1^+$ or $u_2 = \alpha_2^+$ or both. This will be
proved next.

The proof assumes that both the optimal $u_2 < \alpha_2^+$ and
$u_1 < \alpha_1^+$ occur simultaneously, and arrives at a contradiction.
Since both conditions cannot occur simultaneously, at least one $u_i$
must equal $\alpha_i^+$.

If $u_1 < \alpha_1^+$, then either $u_1 = u_1''$ or $u_1 = \alpha_1^-$.
Thus

$$u_1' > u_1 \ge u_1''. \tag{5.32}$$

Combining (5.32) with (5.13) gives

$$0 > u_1 - u_1' \ge u_1'' - u_1' = \frac{-(u_2 - u_2')k_{12}}{k_{11}}. \tag{5.33}$$

The direction of the inequality is changed when (5.33) is multiplied
through by the negative quantity $k_{12}/k_{22}$.

$$0 < \frac{(u_1 - u_1')k_{12}}{k_{22}} \le \frac{-(u_2 - u_2')k_{12}^2}{k_{11}k_{22}}. \tag{5.34}$$

Using (5.34), Eq. (5.14) becomes inequality (5.35):

$$u_2'' \ge u_2' + \frac{(u_2 - u_2')k_{12}^2}{k_{11}k_{22}}, \tag{5.35}$$

which can be rearranged as

$$u_2'' \ge u_2'\left(1 - \frac{k_{12}^2}{k_{11}k_{22}}\right) + \frac{u_2 k_{12}^2}{k_{11}k_{22}}. \tag{5.36}$$

Since $u_2' > \alpha_2^+ > u_2$ by (5.29), inequality (5.36) becomes

$$u_2'' > u_2\left(1 - \frac{k_{12}^2}{k_{11}k_{22}}\right) + \frac{u_2 k_{12}^2}{k_{11}k_{22}} = u_2, \tag{5.37}$$

which contradicts the original assumption that $u_2 < \alpha_2^+$ (and
thus $u_2'' \le u_2$). Thus the proof is complete.

Since it is not known which $u_i = \alpha_i^+$, calculate $u_1''$ on the
assumption that $u_2 = \alpha_2^+$, and calculate $u_2''$ on the
assumption that $u_1 = \alpha_1^+$. If both $u_i'' \ge \alpha_i^+$ then
both assumptions were correct and the optimal control is determined.

At least one assumption was correct, thus at least one $u_i''$ was
computed correctly. Assume $u_1 = \alpha_1^+$ is correct but
$u_2 = \alpha_2^+$ is incorrect; then $u_2''$ is correct but not $u_1''$.
The last step is to prove that $u_1'' > \alpha_1^+$ even though it was
computed using an incorrect assumption. Thus

$$u_1'' = u_1' - \frac{(\alpha_2^+ - u_2')k_{12}}{k_{11}} \tag{5.38}$$

and

$$\alpha_2^+ > u_2. \tag{5.39}$$

Again since $k_{12}$ is negative,

$$0 < \frac{(\alpha_2^+ - u_2')k_{12}}{k_{11}} < \frac{(u_2 - u_2')k_{12}}{k_{11}}. \tag{5.40}$$

Substituting (5.40) into (5.38) gives

$$u_1'' > u_1' - \frac{(u_2 - u_2')k_{12}}{k_{11}} \ge u_1' + \frac{(\alpha_1^+ - u_1')k_{12}^2}{k_{11}k_{22}} > \alpha_1^+, \tag{5.41}$$

where the middle step uses $u_2 \ge u_2'' = u_2' - (\alpha_1^+ - u_1')k_{12}/k_{22}$
from (5.14) and (5.16), and the last step follows from (5.18) and (5.28).


To sum up, the above proofs show a simple way to calculate the
control that minimizes the quadratic function given by Eq. (5.11):

1.  Case 1. If $u_i'$ is not admissible but $u_j'$ is admissible, then
$u_i$ equals the bound nearest to $u_i'$, and $u_j$ is determined from
$u_j''$. The value of $u_j''$ is calculated using the known optimal value
of $u_i$, namely $\alpha_i^+$ or $\alpha_i^-$.

2.  Case 2. If neither $u_1'$ nor $u_2'$ is admissible, then $u_1''$ is
computed using the bound nearest to $u_2'$ for $u_2$, and $u_2''$ is
likewise computed. The optimal control is then determined from Eqs. (5.15)
and (5.16).
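
The two cases can be collected into a short routine. The Python sketch
below is an illustration under stated assumptions, not the program used
for this report: the names `clip`, `xi`, and `minimize_on_rectangle` are
hypothetical, K is passed as a 2 x 2 nested list assumed symmetric and
positive definite, and the bounds are passed as pairs
(`alpha_minus`, `alpha_plus`).

```python
def clip(x, lo, hi):
    """Saturate x to the interval [lo, hi], as in Eqs. (5.15) and (5.16)."""
    return max(lo, min(hi, x))

def xi(K, u_prime, u):
    """The quadratic function of Eq. (5.11)."""
    d1, d2 = u[0] - u_prime[0], u[1] - u_prime[1]
    return K[0][0] * d1 * d1 + 2.0 * K[0][1] * d1 * d2 + K[1][1] * d2 * d2

def minimize_on_rectangle(K, u_prime, alpha_minus, alpha_plus):
    """Minimize Eq. (5.11) over alpha_minus <= u <= alpha_plus."""
    k11, k12, k22 = K[0][0], K[0][1], K[1][1]
    u1p, u2p = u_prime
    # Bound nearest to each u_i' (equal to u_i' itself when u_i' is admissible).
    b1 = clip(u1p, alpha_minus[0], alpha_plus[0])
    b2 = clip(u2p, alpha_minus[1], alpha_plus[1])
    adm1, adm2 = (b1 == u1p), (b2 == u2p)
    if adm1 and adm2:
        return (u1p, u2p)                 # unconstrained minimum is allowable
    if adm1:                              # Case 1: only u_2' is not admissible
        u1 = clip(u1p - (b2 - u2p) * k12 / k11, alpha_minus[0], alpha_plus[0])
        return (u1, b2)
    if adm2:                              # Case 1: only u_1' is not admissible
        u2 = clip(u2p - (b1 - u1p) * k12 / k22, alpha_minus[1], alpha_plus[1])
        return (b1, u2)
    # Case 2: neither is admissible; each u_i'' uses the nearest bound of the other input.
    u1pp = u1p - (b2 - u2p) * k12 / k11   # Eq. (5.13)
    u2pp = u2p - (b1 - u1p) * k12 / k22   # Eq. (5.14)
    return (clip(u1pp, alpha_minus[0], alpha_plus[0]),
            clip(u2pp, alpha_minus[1], alpha_plus[1]))

# Hypothetical Case 2 example with negative k12.
K = [[1.0, -0.9], [-0.9, 1.0]]
print(minimize_on_rectangle(K, (5.0, 1.2), (-1.0, -1.0), (1.0, 1.0)))  # (1.0, -1.0)
```

In the example both elements of $u'$ exceed their upper bounds, yet the
optimal $u_2$ lies on its lower bound, which illustrates the remark that
the real optimal value can be far from the bound suggested by $u'$.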


E.  OPTIMAL  COMTROL  FORMULAS  AND  RECURRENCE  RELATIONS 

The  recurrence  relations  \4ien  neither  control  is  saturated  are  the 
same  as  the  unsaturated  control  recurrence  relations  given  in  Chapter 
III.  When  both  controls  are  saturated,  the  relations  eure  the  same  as 
the  saturated  control  recxirrence  relations  in  Chapter  III,  thou^  written 
in  vector  notation.  Thus  the  only  new  recurrence  relations  are  for  the 
case  in  \diich  one  control  is  saturated  and  the  other  Is  not. 

Assume the optimal control is given by $u_1(0) = u_1''(0)$ and $u_2(0) =
\alpha_2$. The derivation of $u_1''(0)$ and the recurrence relations begins by
completing the square on $u_1(0)$, assuming $u_2(0) = \alpha_2$ (where as usual
$\alpha_2$ represents either $\alpha_2^+$ or $\alpha_2^-$). Note that completing the square
on $u_1(0)$ is not at all the same as completing the square on the vector
$u(0)$.

The elements of the control vector $u(0)$ are separated in Eq. (5.3)
by partitioning the $\Delta$ matrix as follows:

    $\Delta = (\Delta_1 \mid \Delta_2)$,    (5.42)

where the $\Delta_i$ ($i = 1, 2$) are ($m \times 1$) column matrices. With this
partitioning, Eq. (5.3) is written as
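
    $J_{N+1}[z(0)] = \min_{u_1(0)} \{ \gamma_{11} u_1^2(0) + 2\gamma_{12} u_1(0)\alpha_2 + \alpha_2^2\gamma_{22}$
        $+ [\Phi z(0) + \Delta_1 u_1(0) + \Delta_2\alpha_2]^T (Q + P_N)[\Phi z(0) + \Delta_1 u_1(0) + \Delta_2\alpha_2]$
        $+ [\Phi z(0) + \Delta_1 u_1(0) + \Delta_2\alpha_2]^T R_N$
        $+ R_N^T[\Phi z(0) + \Delta_1 u_1(0) + \Delta_2\alpha_2] + C_N \}$.    (5.43)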


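Completing the square on $u_1(0)$ gives

    $J_{N+1}[z(0)] = \min_{u_1(0)} \{ [\Delta_1^T(Q + P_N)\Delta_1 + \gamma_{11}]\,[u_1(0) - u_1''(0)]^2$
        $- [\Delta_1^T(Q + P_N)\Delta_1 + \gamma_{11}]\,[u_1''(0)]^2 + \alpha_2^2\gamma_{22}$
        $+ [\Phi z(0) + \Delta_2\alpha_2]^T (Q + P_N)[\Phi z(0) + \Delta_2\alpha_2]$
        $+ [\Phi z(0) + \Delta_2\alpha_2]^T R_N + R_N^T[\Phi z(0) + \Delta_2\alpha_2] + C_N \}$,    (5.44)

where

    $u_1''(0) = A'_{1,N+1} z(0) + B'_{1,N+1}$,    (5.45)

    $A'_{1,N+1} = -\frac{\Delta_1^T(Q + P_N)\Phi}{\Delta_1^T(Q + P_N)\Delta_1 + \gamma_{11}}$,    (5.46)

    $B'_{1,N+1} = -\frac{\Delta_1^T R_N + \Delta_1^T(Q + P_N)\Delta_2\alpha_2 + \gamma_{12}\alpha_2}{\Delta_1^T(Q + P_N)\Delta_1 + \gamma_{11}}$.    (5.47)

$A'_{2,N+1}$ and $B'_{2,N+1}$ are determined in the same manner.

The recurrence relations for $u_1(0) = u_1''(0)$ and $u_2(0) = \alpha_2$ are
determined by equating the assumed form of $J_{N+1}[z(0)]$ given by
Eq. (3.20) with Eq. (5.44). These recurrence relations are

    $P_{N+1} = \Phi^T(Q + P_N)(\Phi + \Delta_1 A'_{1,N+1})$,    (5.48)

    $R_{N+1} = \Phi^T[(Q + P_N)(\Delta_1 B'_{1,N+1} + \Delta_2\alpha_2) + R_N]$,    (5.49)

    $C_{N+1} = C_N + B'_{1,N+1}[\Delta_1^T R_N + \Delta_1^T(Q + P_N)\Delta_2\alpha_2 + \gamma_{12}\alpha_2]$
        $+ 2\alpha_2\Delta_2^T R_N + \alpha_2^2[\Delta_2^T(Q + P_N)\Delta_2 + \gamma_{22}]$.    (5.50)

As shown in Chapter III, the existence of these recurrence relations
shows that the form assumed for $J_N[z(0)]$ is correct.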

The  equations  and  recurrence  relations  for  the  two-input  control 
case  are  summarized  at  the  end  of  this  chapter. 
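
As a hedged illustration of the mixed case, the sketch below evaluates the feedback of Eqs. (5.46) and (5.47) for a two-state system and confirms numerically that the resulting $u_1''(0)$ minimizes the bracketed expression of Eq. (5.43) with $u_2(0)$ held at $\alpha_2$. All numerical values (the second input column `D2`, the weights `G`, and `P`, `R`, `C`) are hypothetical and serve only to exercise the formulas; the function names are likewise not from the report.

```python
def mat_vec(M, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def mixed_feedback(Phi, D1, D2, Q, G, P, R, alpha2):
    """Return (A1, B1) such that u1''(0) = A1 z(0) + B1 when u2(0) = alpha2."""
    M = [[Q[i][j] + P[i][j] for j in range(2)] for i in range(2)]  # Q + P_N
    MD1 = mat_vec(M, D1)                 # (Q + P_N) Delta_1, M is symmetric
    denom = dot(D1, MD1) + G[0][0]       # Delta_1^T (Q + P_N) Delta_1 + gamma_11
    A1 = [-(MD1[0] * Phi[0][j] + MD1[1] * Phi[1][j]) / denom for j in range(2)]
    B1 = -(dot(D1, R) + dot(MD1, D2) * alpha2 + G[0][1] * alpha2) / denom
    return A1, B1


def bracket_5_43(Phi, D1, D2, Q, G, P, R, C, alpha2, z, u1):
    """Bracketed expression of Eq. (5.43) for a trial value of u1(0)."""
    M = [[Q[i][j] + P[i][j] for j in range(2)] for i in range(2)]
    Pz = mat_vec(Phi, z)
    x = [Pz[i] + D1[i] * u1 + D2[i] * alpha2 for i in range(2)]  # next state
    return (G[0][0] * u1 * u1 + 2.0 * G[0][1] * u1 * alpha2
            + G[1][1] * alpha2 * alpha2
            + dot(x, mat_vec(M, x)) + 2.0 * dot(R, x) + C)


# Hypothetical data for illustration only.
Phi = [[1.0, 1.0], [0.0, 1.0]]
D1, D2 = [0.5, 1.0], [0.0, 1.0]
Q = [[1.0, 0.0], [0.0, 1.0]]
G = [[1.0, 0.2], [0.2, 1.0]]
P = [[1.0, 1.0], [1.0, 1.5]]
R, C, alpha2 = [0.3, -0.1], 0.5, 1.0
z0 = [0.4, -0.2]

A1, B1 = mixed_feedback(Phi, D1, D2, Q, G, P, R, alpha2)
u1_opt = dot(A1, z0) + B1
h = 1e-3
f0 = bracket_5_43(Phi, D1, D2, Q, G, P, R, C, alpha2, z0, u1_opt)
f_plus = bracket_5_43(Phi, D1, D2, Q, G, P, R, C, alpha2, z0, u1_opt + h)
f_minus = bracket_5_43(Phi, D1, D2, Q, G, P, R, C, alpha2, z0, u1_opt - h)
```

For a quadratic in $u_1(0)$, equal costs at `u1_opt + h` and `u1_opt - h`, both above `f0`, indicate that `u1_opt` is the minimizer.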

F.  COMPUTING  METHOD 

Computing proceeds as in Chapter IV with only a few changes. The
first step is to calculate the region of linear control. Next the eight
regions of first saturation are calculated. These are the regions
computed from the linear region that have one or both controls saturated.

From  each  of  these  regions  are  calculated  nine  more  regions,  regions 
with  the  nine  possible  combinations  of  the  controls.  Regions  are  computed 
in  this  manner  until  all  the  state  space  of  interest  is  covered. 

Certainly the two-dimensional control case will take much more
computing time and storage than the one-dimensional case. The method
outlined here could conceivably be extended to higher-dimensional control,
but the complexity increases rapidly.
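
The region counts quoted above follow from simple enumeration. A minimal sketch, in which the labels `"minus"`, `"unsaturated"`, and `"plus"` are illustrative names for the three possible conditions of each control:

```python
from itertools import product

CONDITIONS = ("minus", "unsaturated", "plus")


def control_combinations(num_inputs):
    """All combinations of saturation conditions for the given number of inputs."""
    return list(product(CONDITIONS, repeat=num_inputs))


two_input = control_combinations(2)
# Regions of first saturation: every combination except "both unsaturated".
first_saturation = [c for c in two_input if c != ("unsaturated", "unsaturated")]
three_input = control_combinations(3)
```

For two inputs this gives nine combinations, eight of which have at least one control saturated; for three inputs there are already 27, which illustrates how quickly the complexity grows with the number of inputs.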
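SUMMARY OF PRINCIPAL EQUATIONS FOR TWO-INPUT CONTROL

Optimal return function

    $J_N[z(0)] = z^T(0) P_N z(0) + z^T(0) R_N + R_N^T z(0) + C_N$    (3.20)

Optimal control

    $u(0) = u'(0)$  if  $\alpha^- < u'(0) < \alpha^+$    (unsaturated)

    $u(0) = \alpha$  if  $u_1''(0) > \alpha_1^+$ or $u_1''(0) < \alpha_1^-$,
                  and  $u_2''(0) > \alpha_2^+$ or $u_2''(0) < \alpha_2^-$    (saturated)

    $u_i(0) = \alpha_i$, $u_j(0) = u_j''(0)$  if  $u_i''(0) > \alpha_i^+$ or $u_i''(0) < \alpha_i^-$,
                  and  $\alpha_j^- < u_j''(0) < \alpha_j^+$    (mixed)

Definitions

    $u'(0) = A_{N+1} z(0) + B_{N+1}$    (3.25)

    $A_{N+1} = -[\Delta^T(Q + P_N)\Delta + \Gamma]^{-1}\Delta^T(Q + P_N)\Phi$    (3.23)

    $B_{N+1} = -[\Delta^T(Q + P_N)\Delta + \Gamma]^{-1}\Delta^T R_N$    (3.24)

    $\Delta = (\Delta_1 \mid \Delta_2)$    (5.42)

    $u_j''(0) = A'_{j,N+1} z(0) + B'_{j,N+1}$,  $j = 1, 2$    (5.45)

    $A'_{j,N+1} = -\frac{\Delta_j^T(Q + P_N)\Phi}{\Delta_j^T(Q + P_N)\Delta_j + \gamma_{jj}}$    (5.46)

    $B'_{j,N+1} = -\frac{\Delta_j^T R_N + \Delta_j^T(Q + P_N)\Delta_i\alpha_i + \gamma_{12}\alpha_i}{\Delta_j^T(Q + P_N)\Delta_j + \gamma_{jj}}$,  $i \ne j$    (5.47)

Recurrence relations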


VI.  EXAMPLES 


In this chapter four examples of optimal control systems computed
using the method described in Chapter IV are presented. Some optimal
trajectories in state space are also shown. In the final section the
synthesis of the systems is discussed.

A.  EXAMPLE A

For the first example consider the space vehicle described in Chapter
II. The optimal control that minimizes the attitude sum-squared-error
from any initial attitude error and its rate of change is to be found.
Since power consumption is an important design consideration, the total
energy used in controlling the vehicle is charged by including
sum-squared-control in the performance criterion.

All parameters are normalized to unity, and the sampling interval
is arbitrarily set at T = 1. The resulting system is shown in Fig. 5.


FIG. 5.  BLOCK DIAGRAM OF SYSTEM FOR EXAMPLES A-C.


The state-transition equations, as derived in Chapter II, are

    $z(n+1) = \begin{bmatrix} 1.0 & 1.0 \\ 0.0 & 1.0 \end{bmatrix} z(n) + \begin{bmatrix} 0.5 \\ 1.0 \end{bmatrix} u(n)$,    (6.1)

where control is limited by

    $-1.0 \le u(n) \le +1.0$.    (6.2)

The performance index is

    $J_N[z(0)] = \sum_{n=1}^{\infty} [z_1^2(n) + u^2(n-1)]$,    (6.3)

and thus

    $Q = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$,  $\gamma = 1$.    (6.4)


The  optimal  control  for  this  example  is  shown  in  Fig.  6.  This  figure 
shows  the  state  space  divided  into  three  main  parts.  In  the  upper  area 
of  the  figure  the  optimal  control  is  u  =  -1.  This  area  is  composed  of 
all  the  alpha-minus  regions  that  were  calculated  using  the  method 
described  in  Chapter  IV.  The  boundaries  of  these  regions  are  not  shown 
in  the  figure,  since  they  represent  information  that  is  unnecessary  to 
the  synthesis  of  the  system. 


FIG.  6.  OPTIMAL  CONTROL  FOR  EXAMPLE  A. 
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Through  the  middle  of  Fig.  6  are  the  regions  in  which  the  control 
is  optimally  unsaturated,  with  the  region  of  linear  control  in  the  very 
center.  The  feedback  parameters  for  these  regions  are  given  in  Table  1, 
and  the  optimal  control  for  each  of  these  regions  is  given  by 

    $u(0) = A_{N+1} z(0) + B_{N+1} = a_1 z_1(0) + a_2 z_2(0) + b$.    (6.5)

TABLE 1.  FEEDBACK COEFFICIENTS FOR EXAMPLE A

    Region          A_{N+1}            B_{N+1}
     No.        a_1         a_2           b

     01      -0.50000    -1.00000      0.00000
     02      -0.43902    -1.12195     -0.48780
     03      -0.35556    -1.13333     -0.92222
     04      -0.29240    -1.12281     -1.32749
     05      -0.24658    -1.10959     -1.71918
     06      -0.43902    -1.12195      0.48780
     07      -0.35556    -1.13333      0.92222
     08      -0.29240    -1.12281      1.32749
     09      -0.24658    -1.10959      1.71918

The  alpha-plus  regions,  where  the  optimal  control  is  u  =  +1,  are 
shown  as  the  lower  part  of  Fig.  6.  Thus  the  optimal  control  is  determined 
for  every  point  in  the  state  space  shown. 

Figure 6 also shows an optimal trajectory starting from initial
condition $z^T(0) = [11 \;\; -4]$. The cost for this initial $z(0)$ can be
computed either by using Eq. (6.3) or by using

    $J_N[z(0)] = z^T(0) P_N z(0) + z^T(0) R_N + R_N^T z(0) + C_N$,    (6.6)

where for the region containing the particular $z(0)$ used here,

    $P_N = \begin{bmatrix} 1.56164 & 1.02740 \\ 1.02740 & 1.62329 \end{bmatrix}$,
    $R_N = \begin{bmatrix} -3.94521 \\ -0.25342 \end{bmatrix}$,
    $C_N = 28.74317$.    (6.7)



Calculated  either  way  the  optimal  cost  is  68.497. 
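
The quoted cost can be reproduced directly from Eq. (6.6). The sketch below is only a numerical check; the function name is illustrative, and the values of $P_N$, $R_N$, and $C_N$ are those of Eq. (6.7) as read from the scanned text.

```python
def quadratic_cost(P, R, C, z):
    """Evaluate z^T P z + z^T R + R^T z + C for a 2-vector z, as in Eq. (6.6)."""
    zPz = sum(z[i] * P[i][j] * z[j] for i in range(2) for j in range(2))
    zR = z[0] * R[0] + z[1] * R[1]
    return zPz + 2.0 * zR + C


P_A = [[1.56164, 1.02740], [1.02740, 1.62329]]
R_A = [-3.94521, -0.25342]
C_A = 28.74317

cost_a = quadratic_cost(P_A, R_A, C_A, [11.0, -4.0])
```

With five-decimal coefficients the result agrees with the stated optimal cost of 68.497 to within about 0.001.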

[Note that the number N (the number of steps to go) is always
infinitely large, since in this problem there are always an infinite number
of steps to go. However, since it is necessary to be able to distinguish
between the N-stage process and the (N+1)-stage process, the symbol for
infinity will not be used to replace N in $P_N$, $R_N$, $C_N$, or
Eq. (6.6).]

In Chapter IV it is stated that the matrix $(\Phi + \Delta A_{N+1})$ is
nonsingular when $\gamma \ne 0$. For the region of linear control in this
example (the 01 region in Fig. 6) this matrix is

    $(\Phi + \Delta A_{N+1}) = \begin{bmatrix} 0.75 & 0.50 \\ -0.50 & 0.00 \end{bmatrix}$.    (6.8)

The determinant of this matrix is 0.25, and thus the matrix is nonsingular
as predicted.
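
The matrix of Eq. (6.8) and its determinant can be checked from the system matrices of Eq. (6.1) and the region 01 coefficients of Table 1. This is a small verification sketch; the variable names are illustrative.

```python
PHI = [[1.0, 1.0], [0.0, 1.0]]     # state-transition matrix of Eq. (6.1)
DELTA = [0.5, 1.0]                 # control column of Eq. (6.1)
A_LINEAR = [-0.5, -1.0]            # region 01 feedback (a_1, a_2) from Table 1

# Closed-loop matrix Phi + Delta * A, where Delta is a column and A a row.
closed_loop = [[PHI[i][j] + DELTA[i] * A_LINEAR[j] for j in range(2)]
               for i in range(2)]
determinant = (closed_loop[0][0] * closed_loop[1][1]
               - closed_loop[0][1] * closed_loop[1][0])
```

The entries come out as 0.75, 0.50, -0.50, and 0.00, and the determinant as 0.25, in agreement with the text.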

Example  A  might  have  been  solved  using  dynamic  programming,  a  general 
computing  method  that  is  able  to  solve  a  wide  variety  of  problems,  many 
of  which  can  be  solved  in  no  other  way.  However,  the  special  method 
used  to  compute  this  example  needed  much  less  memory  storage  than  dynamic 
programming  would  have  required.  The  state-space  grid  over  which  this 
example was computed contained about 30,000 points. A careful use of
symmetry  might  have  reduced  this  to  about  20,000  points;  even  so,  dynamic 
programming  would  have  required  at  least  60,000  words  of  storage. 

The number of words used by the method of Chapter IV cannot be stated
as a function of the size of the state-space grid, since this number
depends on whether all unused z(1) regions are stored in the fast
memory or on tape, on whether z(1) is calculated from z(0) or vice versa,
and on how the regions are stored. In computing this example, only the
boundaries of the regions were stored (a technique that cannot be used
in dynamic programming), and all the information about even the largest
region, the region of linear control, was stored in less than 200 words.
The entire program used only a few thousand words of memory, about
one-tenth as many as would have been required by dynamic programming.

B.  EXAMPLE  B 

Consider the same system as used in Example A. The sampling interval
is still T = 1, but the performance index is now

    $J_N[z(0)] = \sum_{n=1}^{\infty} [z_1^2(n) + z_2^2(n)]$.    (6.9)

Thus

    $Q = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$,  $\gamma = 0$.    (6.10)


FIG. 7.  OPTIMAL CONTROL FOR EXAMPLE B.

The optimal control is shown in Fig. 7 along with an optimal
trajectory. An extended picture of the optimal control is given in Fig.
8. Because the bounds on the control are symmetrical, the regions
are symmetric with respect to the origin, and thus only half the
unsaturated regions are shown in Fig. 8. The feedback parameters $A_{N+1}$
of symmetric regions have the same value and sign, while the $B_{N+1}$
have the opposite sign. The feedback parameters for both Figs. 7 and 8
are given in Table 2.

FIG. 8.  EXTENDED REGION OF OPTIMAL CONTROL FOR EXAMPLE B.

The optimal trajectory shown in Fig. 7 begins at $z^T(0) = [-6.0 \;\; 5.5]$.

The  cost  associated  with  this  initial  condition  can  be  computed  by  using 
either  Eq.  (6.6)  or  (6.9).  In  either  case  the  cost  is  hk^.k^k,  irtiere 
for  the  alpha-minus  region  containing  z(0). 


(6.U) 
TABLE 2.  FEEDBACK COEFFICIENTS FOR EXAMPLE B

    Region          A_{N+1}            B_{N+1}
     No.        a_1         a_2           b

     01      -0.66667    -1.33333      0.00000
     02      -0.47058    -1.23529     -0.44118
     03      -0.36145    -1.18072     -0.85542
     04      -0.29268    -1.14634     -1.25610
     05      -0.24561    -1.12281     -1.64912
     06      -0.21145    -1.10573     -2.03744
     07      -0.18557    -1.09278     -2.42268
     08      -0.16528    -1.08264     -2.80578
     09      -0.14898    -1.07449     -3.18735
     10      -0.14792    -1.07396     -3.21698
     11      -0.13559    -1.06779     -3.56779

It is proved in Chapter IV that if $\gamma = 0$ the matrix $(\Phi + \Delta A_{N+1})$
is singular. For this example the matrix for the region of linear
control is

    $(\Phi + \Delta A_{N+1}) = \begin{bmatrix} 2/3 & 1/3 \\ -2/3 & -1/3 \end{bmatrix}$,    (6.12)

which is certainly singular. Direct calculation shows that for any of
the $A_{N+1}$ calculated in this example this matrix is singular.

Computing the minimum cost from Eq. (6.9) requires summing an infinite
series. This is particularly easy in this case where the matrix (6.12)
is singular. If z(0) is in the region of linear control, then the
z(n) (n = 1, 2, ...) always lie on a line through the origin, in this
case the line with slope -1 shown dashed in Fig. 7. Since $z_2(n) =
-z_1(n)$, using (6.12) with (6.1) shows that each z(n + 1) is given by
the geometrical progression

    $z(n+1) = (1/3)^n z(1)$,  n = 1, 2, ...    (6.13)

Thus the optimal cost is given by

    $J_N[z(0)] = \frac{z_1^2(1) + z_2^2(1)}{1 - (1/3)^2} = \frac{9 z_1^2(1)}{4}$.    (6.14)
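
The geometric progression can be checked by simulating the closed loop in the region of linear control. The sketch below assumes the region 01 coefficients of Table 2 in exact form ($a_1 = -2/3$, $a_2 = -4/3$) and a hypothetical initial state chosen small enough that the control never saturates; it compares the summed cost of Eq. (6.9) with the closed form $9 z_1^2(1)/4$.

```python
PHI = [[1.0, 1.0], [0.0, 1.0]]        # Eq. (6.1)
DELTA = [0.5, 1.0]
A_B = [-2.0 / 3.0, -4.0 / 3.0]        # region 01 of Table 2, exact fractions


def simulate_linear_region(z0, steps=60):
    """Return (summed cost, z(1), largest |u|) under u = a_1 z_1 + a_2 z_2."""
    z = list(z0)
    total, z_first, u_max = 0.0, None, 0.0
    for _ in range(steps):
        u = A_B[0] * z[0] + A_B[1] * z[1]
        u_max = max(u_max, abs(u))
        z = [PHI[0][0] * z[0] + PHI[0][1] * z[1] + DELTA[0] * u,
             PHI[1][0] * z[0] + PHI[1][1] * z[1] + DELTA[1] * u]
        if z_first is None:
            z_first = list(z)
        total += z[0] ** 2 + z[1] ** 2     # terms of Eq. (6.9), n = 1, 2, ...
    return total, z_first, u_max


summed_cost, z1_state, largest_u = simulate_linear_region([0.3, 0.3])
closed_form = 9.0 * z1_state[0] ** 2 / 4.0
```

After the first step the state lies on the line $z_2 = -z_1$ and shrinks by a factor of 1/3 per sample, so the two costs agree.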


C.  EXAMPLE  C 

This example shows the effect of increasing the sampling rate. The
system is the same as in Example B, but the sampling interval T = 0.1
is one-tenth as long.

As shown in Fig. 9, the band of regions of unsaturated control is
much narrower than in Fig. 7 of Example B. Since the regions are much
smaller, there are many more of them. There are over 50 regions
of unsaturated control on each side of the region of linear control in
the state space shown in Fig. 9. The boundaries separating these regions
are not shown because they are so close together.

The optimal control for the region of linear control is given by

    $u = (-9.52382) z_1 + (-10.47619) z_2$.    (6.15)

The feedback for the region of linear control was calculated, beginning
with $P_0 = 0$, by the iteration method discussed in Chapter IV. Whereas
the calculation in Example B took eight iterations to converge to six
significant figures, in Example C it took about 80 iterations to converge
to the same number of figures.
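
The slower convergence at the higher sampling rate can be illustrated with a short iteration for the region of linear control. This is a sketch under stated assumptions, not the report's program: it uses the unsaturated feedback formula (3.23) for a single input together with the update $P_{N+1} = \Phi^T(Q + P_N)(\Phi + \Delta A_{N+1})$, and it assumes the Example C system matrices are $\Phi = [[1, T], [0, 1]]$ and $\Delta = [T^2/2, T]^T$ with T = 0.1, which reduce to Eq. (6.1) for T = 1. The stopping tolerance and function name are arbitrary choices.

```python
def linear_region_gain(Phi, D, Q, gamma, tol=1e-9, max_iter=10000):
    """Iterate from P_0 = 0; return (feedback row A, iterations used)."""
    P = [[0.0, 0.0], [0.0, 0.0]]
    A_prev = None
    for n in range(1, max_iter + 1):
        M = [[Q[i][j] + P[i][j] for j in range(2)] for i in range(2)]  # Q + P_N
        DM = [D[0] * M[0][j] + D[1] * M[1][j] for j in range(2)]       # Delta^T M
        denom = DM[0] * D[0] + DM[1] * D[1] + gamma
        A = [-(DM[0] * Phi[0][j] + DM[1] * Phi[1][j]) / denom for j in range(2)]
        closed = [[Phi[i][j] + D[i] * A[j] for j in range(2)] for i in range(2)]
        MC = [[M[i][0] * closed[0][j] + M[i][1] * closed[1][j]
               for j in range(2)] for i in range(2)]
        # P_{N+1} = Phi^T (Q + P_N) (Phi + Delta A)
        P = [[Phi[0][i] * MC[0][j] + Phi[1][i] * MC[1][j]
              for j in range(2)] for i in range(2)]
        if A_prev is not None and max(abs(A[k] - A_prev[k]) for k in range(2)) < tol:
            return A, n
        A_prev = A
    return A, max_iter


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
gain_b, iters_b = linear_region_gain([[1.0, 1.0], [0.0, 1.0]], [0.5, 1.0],
                                     IDENTITY, 0.0)
gain_c, iters_c = linear_region_gain([[1.0, 0.1], [0.0, 1.0]], [0.005, 0.1],
                                     IDENTITY, 0.0)
```

Under these assumptions the iteration approaches the region 01 coefficients of Table 2 for T = 1 and the coefficients of Eq. (6.15) for T = 0.1, and the second case needs many more iterations than the first. The exact iteration counts depend on the stopping rule and are not asserted here.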


D.  EXAMPLE  D 

This  example,  perhaps  the  most  interesting  presented  here,  considers 
the  artificial  earth  satellite,  including  the  external  force  due  to  the 
gravity  gradient,  discussed  in  Chapter  II.  All  parameters  are  normalized 
and  the  sampling  interval  is  arbitrarily  and  somewhat  unrealistically  set 
at  T  =  1.  Thus  the  state-vector-transition  equation  is 


    $z(n+1) = \begin{bmatrix} 0.54030 & 0.84147 \\ -0.84147 & 0.54030 \end{bmatrix} z(n) + \begin{bmatrix} 0.45970 \\ 0.84147 \end{bmatrix} u(n)$,    (6.16)

where the control is limited by Eq. (6.2). This system is shown in
Fig. 10.

FIG. 9.  OPTIMAL CONTROL FOR EXAMPLE C.


FIG.  10.  BLOCK  DIAGRAM  OF  SYSTEM  FOR  EXAMPLE  D. 


It is desired to make the attitude integral-squared-error a minimum,
and as an approximation the attitude sum-squared-error will be minimized.
(This approximation will unfortunately cause a phenomenon known as
intersample ripple, as will be shown later.) There is no cost on the control,
thus the performance index is

    $J_N[z(0)] = \sum_{n=1}^{\infty} z_1^2(n)$.    (6.17)

Therefore

    $Q = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$,  $\gamma = 0$.    (6.18)


For the purpose of comparison, a simple nonoptimal system is shown
in Fig. 11. This system uses the optimal feedback gains $a_1$ and $a_2$
of the region of linear control, and thus is optimal for initial
conditions close to the origin in state space. If the total output of these
feedback blocks causes saturation of the input, then this is simply
allowed to happen. Thus Fig. 11 shows a system that is extremely simple
to build and is optimal for small initial conditions.

The optimal control regions are shown in Fig. 12; the optimal
feedback coefficients for the unsaturated regions are given in Table 3.
TABLE 3.  FEEDBACK COEFFICIENTS FOR EXAMPLE D

    Region          A_{N+1}            B_{N+1}
     No.        a_1         a_2           b

     01      -1.17534    -1.83049      0.00000
     02       0.13289    -1.11580     -0.39044
     03       0.35351    -0.99526     -0.62492
     04       0.31743    -1.01498     -0.56883
     05       0.49316    -0.91898     -0.86046
     06       0.53957    -0.89363     -0.95780
     07       0.55971    -0.88262     -1.01570
     08       0.66825    -0.82333     -1.38211
     09       0.60572    -0.85749     -1.16421
     10       0.64942    -0.83362     -1.33612
     11       0.66819    -0.82336     -1.42508
     12       0.76747    -0.76912     -1.94701
     13       0.71067    -0.80016     -1.64189
     14       0.75047    -0.77841     -1.87443
     15       0.63394    -0.84207     -1.17957
     16       0.49844    -0.91610      1.96661

These feedback coefficients are used in Eq. (6.5) to determine the
optimal control.

An optimal trajectory from initial condition $z^T(0) = [-5.0 \;\; 3.0]$
is shown in Fig. 12. This initial condition is in alpha-minus region 16
as shown in the figure. When the state vector reaches the region of
linear control, it enters a limit cycle rather than going to the origin.
The value of $z_1(n)$ at the sampling instants is zero in this limit
cycle, thus no cost is charged to the performance index (6.17). However,
the attitude error $z_1(t)$ is zero only at the sampling instants, and
thus there exists a phenomenon known as intersample ripple [Ref. 15].
This ripple can be eliminated during the design of the system by adding
a charge on either $z_2(n)$ or $u(n - 1)$.
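
The limit cycle can be reproduced with a few lines of simulation. The sketch below assumes that the entries of Eq. (6.16) are $\cos 1$, $\sin 1$, and $1 - \cos 1$ (which match the printed values to five places) and observes that the region 01 coefficients of Table 3 then appear to coincide with $-\cos 1/(1 - \cos 1)$ and $-\sin 1/(1 - \cos 1)$, the gains that drive $z_1$ to zero in one step. The starting state is hypothetical and chosen inside the region of linear control.

```python
import math

c, s = math.cos(1.0), math.sin(1.0)
PHI = [[c, s], [-s, c]]              # assumed exact form of Eq. (6.16)
DELTA = [1.0 - c, s]
a1 = -c / (1.0 - c)                  # compare with -1.17534 in Table 3
a2 = -s / (1.0 - c)                  # compare with -1.83049 in Table 3


def step(z):
    """One sampling period under the linear feedback, limited by Eq. (6.2)."""
    u = max(-1.0, min(1.0, a1 * z[0] + a2 * z[1]))
    return [PHI[0][0] * z[0] + PHI[0][1] * z[1] + DELTA[0] * u,
            PHI[1][0] * z[0] + PHI[1][1] * z[1] + DELTA[1] * u]


z_start = [0.0, 0.4]                 # hypothetical state with z_1 = 0
z_after_1 = step(z_start)
z_after_2 = step(z_after_1)
```

In this sketch $z_1$ stays at zero at every sampling instant while $z_2$ alternates in sign with constant magnitude, which is the sampled picture of the limit cycle and the source of the intersample ripple.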

The cost of the trajectory shown in Fig. 12, as determined either
by (6.17) or by (6.6), is 32.702. This cost is most easily determined
by (6.6), where

    $P_N = \begin{bmatrix} 2.947579 & -0.831075 \\ -0.831075 & 4.033005 \end{bmatrix}$,
    $R_N = \begin{bmatrix} 9.389736 \\ -9.400775 \end{bmatrix}$,
    $C_N = 48.085094$    (6.19)

for the region containing z(0), alpha-minus region 16 in Fig. 12.

The control for the nonoptimal system of Fig. 11 is given in Fig. 13.
Between the two parallel lines the control is given by

    $u = (-1.175343) z_1 + (-1.830488) z_2$,    (6.20)

where the two lines are determined by setting $u = \pm 1$ in (6.20). The
trajectory shown in Fig. 13 from initial condition $z^T(0) = [-5.0 \;\; 3.0]$
has a cost determined from Eq. (6.17) of 73.154, an increase of 124
percent over the optimal system. Thus, though considerably more complicated
to mechanize, the optimal system is a substantial improvement over the
simple system of Fig. 11.
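
The comparison can be reproduced approximately by simulating the system of Fig. 11 and evaluating Eq. (6.6) with the values of Eq. (6.19). The sketch below assumes that the entries of Eq. (6.16) are $\cos 1$, $\sin 1$, and $1 - \cos 1$; the number of simulated steps and all names are arbitrary choices for illustration.

```python
import math

c, s = math.cos(1.0), math.sin(1.0)
PHI = [[c, s], [-s, c]]               # assumed exact form of Eq. (6.16)
DELTA = [1.0 - c, s]
A1, A2 = -1.175343, -1.830488         # gains of Eq. (6.20)


def nonoptimal_cost(z0, steps=200):
    """Sum of z_1^2(n), Eq. (6.17), under saturated linear feedback."""
    z = list(z0)
    total = 0.0
    for _ in range(steps):
        u = max(-1.0, min(1.0, A1 * z[0] + A2 * z[1]))   # limits of Eq. (6.2)
        z = [PHI[0][0] * z[0] + PHI[0][1] * z[1] + DELTA[0] * u,
             PHI[1][0] * z[0] + PHI[1][1] * z[1] + DELTA[1] * u]
        total += z[0] ** 2
    return total


def optimal_cost(z):
    """Eq. (6.6) with the P_N, R_N, C_N of Eq. (6.19)."""
    P = [[2.947579, -0.831075], [-0.831075, 4.033005]]
    R = [9.389736, -9.400775]
    C = 48.085094
    zPz = sum(z[i] * P[i][j] * z[j] for i in range(2) for j in range(2))
    return zPz + 2.0 * (z[0] * R[0] + z[1] * R[1]) + C


z0 = [-5.0, 3.0]
cost_simple = nonoptimal_cost(z0)
cost_optimal = optimal_cost(z0)
increase_percent = 100.0 * (cost_simple / cost_optimal - 1.0)
```

Under these assumptions the simulated cost of the simple system and the cost from Eq. (6.6) come out close to the quoted 73.154 and 32.702, and the increase rounds to 124 percent.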

E.  THE  SYNTHESIS 

The  optimal  design  of  a  system  is  often  used  only  as  a  standard  of 
comparison  for  the  system  that  is  actually  built.  However,  if  the  truly 
optimal  system  is  to  be  synthesized,  the  feedback  coefficients  of  the 
unsaturated  regions  can  be  stored  in  a  special-purpose  digital  computer. 
The  computer  takes  the  value  of  the  state  vector  at  the  sampling  instant 
and decides whether the control u is optimally $\alpha^+$, $\alpha^-$, or unsaturated.
If u is optimally unsaturated, the computer determines which
unsaturated region the state vector is in, and computes the control using
Eq. (6.5).

Approximations to the optimal system can be made with varying degrees
of accuracy. For example, some of the unsaturated regions can be
combined into one region with little deviation from the optimal cost; or all
of the unsaturated regions can be eliminated, using their location as
a guide to the placement of a piecewise linear switching curve. Around
the origin in state space, however, the control must be linear if the
system is to return to equilibrium.

FIG. 13.  CONTROL FOR SYSTEM OF FIG. 11.
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